Abstract

Optimization  of  scheduled  arrival  times  to  an  appointment  system  is  approached 
from  the  perspectives  of  both  queueing  and  scheduling  theory.  The  appointment 
system  is  modeled  as  a  single-server,  first-come-first-served,  transient  queue  with 
independent,  distinctly  distributed  service  times  and  no-show  rates.  If  a  customer 
does  show,  it  is  assumed  to  be  punctual.  The  cost  of  operating  the  appointment 
system  is  a  convex  combination  of  customers’  waiting  times  and  the  server’s  overtime. 
While  techniques  for  finding  the  optimal  static  and  dynamic  schedules  of  arrivals 
have  been  proposed  by  other  researchers,  they  mainly  have  focused  on  identical 
customers  and  strictly  punctual  arrivals.  This  effort  provides  substantially  more 
efficient solution methods, addresses a more general cost function, allows for
no-shows and non-identical service distributions, and applies either when arrivals are
constrained  to  lattice  points  or  when  they  are  unconstrained.  Because  customers 
are  not  indistinguishable,  this  effort  also  provides  heuristics  for  determining  optimal 
customer  order.  The  proposed  techniques  apply  to  any  piecewise  convex,  submodular 
function. 
SCHEDULING AND SEQUENCING ARRIVALS TO A
STOCHASTIC  SERVICE  SYSTEM 


I.  Introduction 

1.1  Motivation 

Appointment  systems  have  flourished  over  the  last  century.  With  widespread 
availability  of  the  telephone  and  other  forms  of  improved  communications  has  come 
the  realization  of  efficiencies  due  to  scheduling  that  we  often  take  for  granted.  For 
example,  no  longer  do  most  people  simply  appear  at  a  doctor’s  office  for  routine 
medical  problems  and  wait,  which  was  the  rule  in  many  societies  even  as  late  as  the 
1960s  [45,  69].  Instead,  an  appointed  time  of  arrival  is  agreed  upon,  many  times 
by  phone,  and  the  server  makes  an  implied  commitment  to  attend  to  the  customer 
as  soon  thereafter  as  the  service  protocol  permits.  This  is  the  situation  today  for 
numerous  personal  services.  It  is  also  the  practice  in  many  industrial  settings,  such 
as  the  scheduling  of  cargo  ships  at  port  facilities,  the  scheduling  of  part  deliveries  in 
just-in-time  systems  and  the  scheduling  of  customers  at  military  testing  and  training 
facilities. 

Customers  benefit  greatly  from  such  an  arrangement,  since  their  waiting  times 
(as  measured  from  their  scheduled  arrivals)  are  usually  both  smaller  and  less  variable 
under  a  scheduling  system.  Servers  incur  costs  due  to  the  creation  and  maintenance 
of  the  scheduling  system,  the  loss  of  some  flexibility  in  operation,  and  the  potential  for 
increased  server  idle  time.  On  the  other  hand,  servers  benefit  from  requiring  smaller 
facilities  for  holding  queueing  customers  and  from  increased  customer  satisfaction. 
Since  customers  and  servers  in  general  have  different  goals,  the  scheduling  systems 
that  optimize  their  interests  will  in  general  differ. 
The major goal of this dissertation is to establish methods of optimizing
measures of performance that characterize the interests of all parties, given various
circumstances. The interest of each customer is taken solely as the minimization of
its  expected  waiting  time.  The  interest  of  the  server  is  to  minimize  overtime.  The 
cost  of  the  system  is  assumed  to  be  a  convex  combination  of  the  waiting  times  and 
overtime. 

Such  a  formulation  has  particular  value  when  both  the  customers  and  server 
belong  to  the  same  organization,  as  in  the  case  of  an  Air  Force  aerial  combat  range 
or  military  dental  clinic.  For  such  cases,  the  distinctions  between  server  cost  and 
customer  cost  blur,  and  there  is  greater  certainty  in  the  unit  costs  for  each  entity. 
For  instance,  it  is  possible  to  put  a  precise  cost  on  the  waiting  time  of  each  patient, 
each  medical  provider,  and  on  facility  availability  for  a  military  medical  clinic,  since 
the  government  incurs  a  calculable  cost  for  each.  The  quantitative  importance  of 
serving  a  maximal  number  of  customers  may  still  have  to  be  left  as  a  value  judgment, 
however;  for  instance,  how  does  one  unambiguously  determine  the  monetary  value 
to  the  military  of  a  bombing  training  mission  when  scheduling  a  range,  or  the  cost  to 
society of a sick patient who cannot work and must receive unemployment benefits
as  a  result  of  delayed  medical  care  [35]? 

More difficult are situations in which the respective costs are incurred by
distinct organizations - say, the use of one of the U.S. Air Force’s Air Combat
Maneuvering Instrumentation (ACMI) range facilities by a foreign air force, or the use of
a military medical facility by someone not on active duty. In such cases, subjective
judgments must be applied, such as “a doctor’s time is 37.5 times more valuable
than the patients’” [7].

A  secondary  goal  of  this  dissertation  is  to  show  the  applicability  of  the  solution 
techniques  developed  here  to  other  problems.  It  will  be  shown  that  the  cost  function 
arising  from  this  optimization  problem  is  related  in  structure  to  a  large  class  of 
problems in physics and resource allocation, and as a result, the solution methods
may profitably be applied to these problems.

With these aims in mind, a more formal description of the problem is considered
next.

1.2  Problem  Description 

A (fixed) set of N customers is to be assigned arrival times to a single server.
Each customer arrives precisely on time unless it fails to show altogether. The
probabilities for each customer showing are known. Service time probability density
functions (PDFs) are known and are independent of each other, of the show
probabilities, and of the scheduled arrival times. The cost of operating the system is
defined to be a convex combination of the individual expected waiting times and the
server overtime. Figure 1 depicts the situation. Here, $\tau_i$ is the scheduled arrival time
of customer $i$, while $x_i$ and $w_i$ are the service duration and waiting times and U is
the idle time of the server that ends at $\tau_i$, each for a particular realization of the
schedule.


Server overtime is defined as the time from some user-defined point, $\tau_v$, to the
time of completion for the last customer. Overtime is a generalization of constructs
used in previous research. For example, by setting $\tau_v = 0$, overtime becomes the
total expected time the server must be available, which is the sum of the expected
service time and the expected idle time. Since the expected service time is fixed,
minimization of overtime in this case is equivalent to minimization of idle time.

Customers are constrained to arrive between 0 and the schedule horizon, $\tau_h$.
The first scheduled arrival time, $\tau_1$, is fixed at zero, clearly its optimal value when
lateness is not permitted.

A  distinction  will  be  made  throughout  between  the  schedule  of  arrivals  and  the 
sequence  of  arrivals.1  To  some  extent,  the  sequence  is  determined  by  the  schedule, 
but  if  two  customers  have  identical  arrival  times,  the  cost  may  vary  depending  on 
who  is  served  first.  The  approach  will  be  to  decompose  the  problem  into  two  parts: 
finding  the  optimal  schedule  for  a  given  sequence  and  finding  the  optimal  sequence. 

The cost of operation of a particular schedule/sequence of customers is defined
as a convex combination of the expected customer waiting times and expected server
overtime. With the inclusion of the constants $\tau_h$ and $\tau_v$, this formulation is quite
flexible. For example, by setting $\tau_v = 0$ and setting the schedule horizon to a
sufficiently large value, overtime becomes the total time the server must remain in
operation, and the cost function becomes one considered in several works [160, 97].
Another commonly used cost function can be obtained by setting $\tau_v = \tau_h + \Delta$, where
$\Delta$ is the smallest schedule increment allowable (lattice size), in which case overtime
assumes a more traditional meaning [145, 158].
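
As a rough illustration of the cost just described, the following Python sketch estimates the expected cost of a given schedule by simulation. It is not the evaluation method developed in this dissertation. The function name `simulate_cost`, the use of exponential service times, and the treatment of overtime as the positive part of the last completion time minus $\tau_v$ are all assumptions made only for this example.

```python
import random


def simulate_cost(tau, show_prob, mean_service, wait_cost, overtime_cost,
                  tau_v, n_runs=20000, seed=0):
    """Monte Carlo estimate of the expected cost of a schedule (illustrative).

    tau           scheduled arrival times, in the order customers are served
    show_prob     probability that each customer shows (punctually)
    mean_service  mean of each customer's service time (ASSUMED exponential)
    wait_cost     unit cost of waiting for each customer
    overtime_cost unit cost of server overtime
    tau_v         time at which overtime costs begin
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        free_at = 0.0  # time at which the server next becomes free
        cost = 0.0
        for t, p, m, c in zip(tau, show_prob, mean_service, wait_cost):
            if rng.random() >= p:
                continue  # no-show: zero service and zero waiting
            start = max(t, free_at)      # served in scheduled order
            cost += c * (start - t)      # waiting measured from scheduled arrival
            free_at = start + rng.expovariate(1.0 / m)
        # ASSUMPTION: overtime is the positive part of (last completion - tau_v)
        cost += overtime_cost * max(0.0, free_at - tau_v)
        total += cost
    return total / n_runs
```

With `tau_v = 0` and zero waiting costs, the estimate reduces to the expected time the server must remain in operation, which matches the special case mentioned above. For a convex combination, the entries of `wait_cost` and `overtime_cost` would be nonnegative and sum to one.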

Some special cases considered in this dissertation can be classified by the
following characteristics:

-  Service distribution. The development of the optimization algorithm will
assume a general form for each customer’s service time distribution, restricted
only by the requirements that each be of bounded variation, have only positive
support, and be independent of the other quantities involved. The cost
evaluation implementation will restrict the form of the service distributions further,
requiring them to be Coxian, with the addition of distinct probabilities of each
customer requiring zero service (i.e., failing to show for the appointment). The
special cases of Erlang and iid (identically, independently distributed) services
without no-shows will also be examined, since they lead to simplifications in
the evaluation.

1 Throughout this dissertation, the term “sequence” is used to refer to the sequence of planned
customer arrivals, while “schedule” refers to the vector of planned arrival times for each customer.
These definitions are natural to this problem, and they were advocated by earlier researchers in
scheduling theory [22: p 450] [146: p 295]. However, they are a departure from current usage [130: pp
15-16].

-  Arrival constraints. In practice, the scheduling period is always bounded and
frequently is of fixed length. The horizon does not restrict the end of service
or the overtime point. Bounded, variable-length schedules are assumed unless
otherwise stated. In the pure scheduling problem, the sequence is assumed
already fixed, and so arrival times are constrained by
$0 = \tau_1 \le \cdots \le \tau_{N-1} \le \tau_N \le \tau_h$.

-  Server overtime. Unless otherwise stated, it will be assumed that $\tau_v = \tau_h$.
This models a commonly observed appointment system, in which the server
continues to be available to accept new customers right up until closing time.

-  Arrival discipline - block vs. individual strategies. Scheduling strategies
typically are classified as individual, block, or mixed. Individual strategies allow
each arrival to be scheduled separately, while block strategies constrain arrivals
to occur only at the beginning of each block of time. Blocks need not be the
same size or contain the same number of customers. Mixed strategies typically
consist of a set of simultaneous arrivals at the beginning of each block, with the
remainder of the customers scheduled individually. This study seeks only the
optimal individual schedule; since it is the least restrictive and encompasses
the other cases, the optimal individual schedule cost is never greater than the
optimal costs for block or mixed schedules.
-  Arrival discipline - continuous vs. lattice. In this dissertation, the case where
arrival times are lattice (i.e., restricted to occur only at discrete intervals)
receives attention, particularly the case of evenly spaced lattices. While the cost
under this condition can be obtained trivially from a continuous formulation of
the problem, the regularly-spaced lattice arrival case is addressed separately for
three reasons. First, in many cases it is realistic. Most appointment systems
permit arrivals to be scheduled only at regular intervals; none allow for arrivals
in continuous time, and few even schedule to the nearest minute. Second, it
provides for simpler computation of the cost function. Last, the optimization
of any function in lattice space requires special consideration, since the
optimum in lattice space generally is not obtainable merely by rounding off the
solution of the corresponding continuous case.

-  Arrival discipline - no-shows and customer punctuality. The probability of a
customer showing at the scheduled time and entering the queue is distinct and
independent of queue size or other characteristics. Accounting for no-shows
is of critical importance in many systems, since show rates can be as low as
70% [40, 66, 147]. Because a no-show may be considered as an infinitely late
customer, it is reasonable to believe that accounting for no-shows is often of
greater importance in modeling a system than accounting for lateness. The
show probability is assumed to be 1.0 for each customer unless otherwise specified.

-  Queueing discipline. It is assumed throughout that customers are served in
the order they are scheduled. This is equivalent to first in, first out (FIFO),
except in section 3.6, where customer lateness is incorporated into the model.

-  Scheduling goal. The schedule may be dynamic or static. In the dynamic case,
the schedule of future arrivals may be revised at any time, taking advantage of
all information regarding past events and the current state of the system. The
static case fixes the schedule prior to the start of service. The latter case will
be assumed. In some works the two problems are referred to as the short- and
long-range versions of the problem.

-  Optimization  goal.  It  has  been  clear  from  past  work  in  control  of  arrivals  to  a 
queue  that  the  arrival  time  that  optimizes  expected  cost  from  an  individual’s 
narrow  point  of  view  often  differs  from  that  selected  in  order  to  optimize  the 
expected  global  cost  [113].  Optimization  is  intended  in  the  global  sense  in  this 
work.  There  is  a  single  objective;  multicriteria  optimization  is  examined  only 
briefly. 

1.3  Definition  of  Symbols,  Terms,  and  Acronyms 

Symbols  and  acronyms  are  defined  at  their  first  use  throughout  this  document. 
For  the  reader’s  convenience,  those  that  are  used  more  than  once  are  also  defined 
below.  Those  symbols  and  acronyms  that  are  used  only  once  are  not  defined  below. 
The lower-case symbols g, h, i, j, k, m, and n are used exclusively for indices and thus
may have different meanings in different sections.

Kendall’s  notation  for  queues  is  used  [80].  S(N)  is  used  to  denote  a  queue 
with  N  deterministic  arrival  times  in  which  interarrival  times  may  not  be  constant. 
The  standard  notations  for  sequencing  problems  will  be  seen  to  be  inadequate  for 
this  sequencing/scheduling  problem.  When  referring  to  other  research,  the  notation 
for  sequencing  problems  is  that  suggested  by  a  number  of  authors  and  modified  by 
Pinedo  [130]. 

The  following  trademarks  are  mentioned:  MATLAB  (The  Math  Works,  Inc.), 
Pentium  (Intel),  PowerStation  (Microsoft),  and  SparcStation  (Sun). 
Table 1. Definitions of terms and variables

agreeable  A set of customers with deterministic services for which $x_i \le x_j$
implies $c_i \ge c_j$ $\forall i, j$ is said to possess agreeable weights.

block bidiagonal  Refers to a matrix that can be partitioned into blocks such that
the [i,j] block is nonzero only if j = i or j = i + 1.

$b_j$  The probability of immediate completion of a Coxian service,
given the jth phase was just completed. A second subscript is
added if not all customers have identical service distributions.

combination  A set of objects undistinguished by order. The permutations
AAB, ABA, and BAA are all represented by the same combination,
denoted by AAB in this research.

confluent  Refers to eigenvalues that are equal.

convex combination  $\sum_j \lambda_j x_j$ is a convex combination of the elements of $x$ if $\lambda$ is
a vector of constants such that $\lambda_j \ge 0$ for all $j$ and $\sum_j \lambda_j = 1$.

customer class  A subset of customers, each member of which is indistinguishable
in terms of their probability of arriving on time, cost of service,
and service time PDF, prior to schedule implementation.

$c$  Vector of unit costs of waiting. Also used as the coefficient of
variation of a distribution.

$c_{N+1}$  Unit cost of server overtime.

$C(\tau, c)$  Cost of schedule of arrivals $\tau$. The second argument is dropped
if all unit costs of waiting are equal.

C  Optimal cost over all schedules.

$x_j$  Service time of the jth customer in a particular realization.

$x_{j,i}$  Expected remaining service time of the jth arrival, given it is
currently in its ith Coxian stage of service. The initial subscript
is dropped if all customers are identical.
dynamic  The problem of determining an optimal schedule sequence policy
at any time during schedule implementation, given information
on the realization of the stochastic variables (service time and
whether the customer arrived) for each previous customer [129].
This is equivalent to closed loop control of arrivals [152].

distinct  Adjective denoting a set of PDFs that may not be
all identical.

$\Delta$  In the case of evenly-spaced lattice arrival times, the smallest
possible positive time step.

$E_r$  Erlang service PDF with r phases or stages of service.

fathom  To prove to be suboptimal. A decision fathoms a set of
alternative decisions if the cost incurred under the decision is lower
than that incurred under each alternative.

FIFO  First-in, first-out service protocol.

PDF of service time for the jth customer.

When comparing two vectors, $x \preceq y$ iff $x_i \le y_i$ $\forall i$.

$x \prec y$ iff $x \preceq y$ and $x \ne y$.

Probability of the jth customer showing at the appointed time.
The subscript is dropped if all customers have identical show
rates. Also used as the skewness of a distribution.

idd  Independently, distinctly distributed.

iid  Independently, identically distributed.

The expected idle time the server incurs waiting for customer j.

The greatest integer less than x.

join  $x \vee y = [\max(x_1, y_1), \max(x_2, y_2), \ldots]$ is the join of vectors $x$ and $y$.

Number of time slots in a lattice schedule in which a customer
may be scheduled.

Maclaurin series  Special case of the Taylor series, expanded about zero.

mean residual life  The mean residual life, $L(t)$, is the expected remaining “life” of
a process at $t$, given the process is still “alive” at $t$:
$L(t) = E[x - t \mid x > t]$.

meet  $x \wedge y = [\min(x_1, y_1), \min(x_2, y_2), \ldots]$ is the meet of vectors $x$ and $y$.

$\mu_j$  The service rate of the jth phase of a Coxian-r PDF. A second
subscript is added if not all customers have identical service
distributions.

NLP  Nonlinear program, a means of finding the minimum value of a
nonlinear function or a function subject to nonlinear constraints.

norm  The norm of A, $\|A\|$, is any of various metrics of a matrix
(or vector). It provides some measure of the “size” of A.

N  Number of customers to be scheduled.

$N_1$  Index of the first customer optimally scheduled at the horizon
in the optimal schedule for a given sequence. This definition
applies only for deterministic problems.

n(j)  The number of customers arriving in slot j.

$\omega_{j,i}$  The expected waiting time of the jth customer, given the system
is in its ith state (in a phase-type representation).

PDF  Probability density function.

preemption  Temporary removal of the current customer from service in
order to serve a higher priority customer.

priority  A ranking of customers by importance. In a priority queue
without preemption, the highest priority customer present is served
as soon as the current customer completes service.

P  Transition matrix, not including exit phase.

The ith noncentral moment of a distribution.
The ith scaled, noncentral moment of a distribution.

A row vector of N + 1 elements, all equal to one.

Transition matrix, including exit phase.

$\mathbb{Q}$  Denotes the set of all rational numbers.

realization  The value of a random variable in a particular trial.

$r_i$  The number of phases in the Coxian service distribution of the
ith customer. The subscript is dropped if all customers have
identical service distributions. The variable is also used briefly
in Chapter II to refer to release dates.

R  Right-shift operator, with accumulation in the last entry: if
$x = [1\ 2\ 3\ 4\ 5]$, $R(x) = [0\ 1\ 2\ 3\ 9]$. It is also defined as the matrix for
which $R(x) = xR$.

$\Re$  Denotes the set of real numbers. $\Re^+$ denotes the positive real
numbers, including zero.

scheduling  Determining arrival times.

sequencing  Determining the sequence of arrivals.

static  The problem of determining an optimal schedule sequence policy
at any time during schedule implementation, in ignorance of the
realization of the stochastic variables (service time and whether
the customer arrived) for each previous customer. This is
equivalent to determining the optimal schedule prior to schedule start.
It is also referred to as open loop control of arrivals [152].

submodular  A function $f : \Re^n \to \Re$ is submodular on $\Re^n$ if
$f(x \wedge y) + f(x \vee y) \le f(x) + f(y)$ $\forall x, y$,
where $\wedge$ is the meet operation and $\vee$ is the join operation.

support  The set of numbers for which a PDF is nonzero is called the
support of that PDF.

S(N)  Used by several references to denote deterministic arrival times
to a queue - e.g., S(N)/G/1.

$S_i$  The ith schedule being considered. Used at times in place of $\tau$,
since $\tau_i$ refers to the scheduled arrival time of the ith customer.

S  Optimal schedule of those constrained to a given lattice.

S  Optimal unconstrained schedule.

t  Time, measured from the start of the scheduling period.

$\tau_h$  The schedule horizon; the time interval within which customers
may be scheduled.

$\tau_j$  Arrival time of the jth customer. $\tau_1$ is taken to be zero unless
otherwise specified.

$\tau_v$  The time that overtime costs begin.

$\theta_j$  Show vector. Defined as 1 if, in a particular realization, the jth
customer arrived at the appointed time, zero otherwise.

upper triangular  Denotes a matrix A in which $A_{ij} = 0$ $\forall i, j : i > j$.

$w_j$  The expected waiting time of customer j.

WSEPT  Acronym for weighted shortest expected processing time. The
customer sequence obtained by ordering customers from smallest
to largest value of $E(x_i)/c_i$ in stochastic sequencing problems.

WSPT  Acronym for weighted shortest processing time. The customer
sequence obtained by ordering customers from smallest to largest
value of $x_i/c_i$ in deterministic sequencing problems.

WSVPT  Acronym for weighted shortest variance of processing time. The
customer sequence obtained by ordering customers from smallest
to largest value of $\mathrm{VAR}(x_i)/c_i$ in stochastic sequencing problems.
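
Several of the definitions in Table 1 are easier to absorb with a small worked example. The following Python sketch illustrates the meet and join of two vectors, a brute-force check of the submodular inequality over a finite set of points, the right-shift operator R in both its functional and matrix forms, and the WSPT ordering. All function names are hypothetical and chosen only for this illustration; none of this code is taken from the dissertation.

```python
def meet(x, y):
    """Componentwise minimum of two vectors."""
    return [min(a, b) for a, b in zip(x, y)]


def join(x, y):
    """Componentwise maximum of two vectors."""
    return [max(a, b) for a, b in zip(x, y)]


def is_submodular_on(f, points, tol=1e-12):
    """Check f(meet) + f(join) <= f(x) + f(y) for every pair in `points`.

    This verifies the inequality only on the supplied finite set of points.
    """
    return all(f(meet(x, y)) + f(join(x, y)) <= f(x) + f(y) + tol
               for x in points for y in points)


def right_shift(x):
    """Shift entries right by one, accumulating the last two in the last entry."""
    if len(x) < 2:
        return list(x)
    return [0] + list(x[:-2]) + [x[-2] + x[-1]]


def right_shift_matrix(n):
    """Matrix R such that the row vector x times R equals right_shift(x)."""
    r = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        r[i][i + 1] = 1      # entry i moves to position i + 1
    r[n - 1][n - 1] = 1      # the last entry stays and accumulates
    return r


def wspt_order(service, weight):
    """Customer indices ordered from smallest to largest service/weight ratio."""
    return sorted(range(len(service)), key=lambda i: service[i] / weight[i])
```

For instance, `right_shift([1, 2, 3, 4, 5])` gives `[0, 1, 2, 3, 9]`, matching the example in the definition of R. Passing expected service times to `wspt_order` gives the WSEPT ordering instead.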
1.4 Overview

A part of the problem this dissertation addresses - the scheduling of arrivals
to an appointment system in order to minimize cost - is one that has been
addressed in over 60 articles in the last 47 years. This dissertation builds on these
earlier attempts to model appointment systems realistically and to develop efficient
approaches to optimizing the schedule for a given sequence of arrivals. It provides
substantial improvements in both these areas. In addition, this effort explores the
effect of changing the sequence of customer arrivals, and it offers a heuristic
algorithm to determine the optimal sequence. The importance of sequencing of arrivals
to an appointment system has only been discussed once in the literature, and no
optimization algorithm was proposed. The literature related to these problems is
discussed in detail in Chapter II.

Chapter III considers the formulation and evaluation of the function
representing the cost of a particular schedule and sequence of arrivals for a given system. It
approximates customer services with Coxian distributions and employs a
continuous Markov chain embedded at the customer arrival epochs to determine expected
waiting  times.  Alternative  approaches  are  developed  for  general  distributions  and 
Erlang  distributions.  A  cost  evaluation  scheme  that  accounts  for  customer  lateness 
is  shown  here,  but  lateness  will  not  be  addressed  in  subsequent  sections.  The  nature 
of  the  cost  function  is  considered,  as  a  prelude  to  the  optimization  effort. 

The optimization problem naturally presents itself as two sub-problems.
Chapter IV addresses the determination of the optimal arrival times of each customer,
given the sequence in which the customers are scheduled to arrive. An algorithm to find
the optimal lattice schedule is developed and proven analytically to be effective. It
is based on the piecewise convexity and submodularity (cf. Section 2.5) of the cost
function with respect to scheduled arrival times.

Chapter V addresses the determination of the optimal sequence in which
customers are scheduled to arrive. The problem is demonstrated to be complex, with
optimal sequences appearing chaotic to the casual observer. Although the problem
is suspected to be strongly NP-hard, a heuristic algorithm is shown empirically to
perform effectively in determining the optimal sequence in polynomial time.

The last chapter is devoted to a discussion of the contributions of this
dissertation to the research community. Applicability to various actual problems will be
evaluated.  Several  goals  for  future  research  and  possible  approaches  to  those  goals 
are  offered. 

The deterministic analogue to the problem of sequencing and scheduling
arrivals to an appointment system is examined in Appendix A. While this problem is
unrealistic  in  and  of  itself,  the  complexities  seen  in  the  stochastic  solution  have  their 
root  in  this  problem,  and  it  is  therefore  worth  considering. 

Functions  with  the  same  piecewise  convex  and  submodular  structure  as  the 
one  addressed  here  are  ubiquitous  and  span  a  number  of  disciplines.  Appendix  B 
discusses  the  advantages  and  limitations  of  using  the  methods  applied  in  Chapter 
IV  to  optimize  such  functions  over  a  lattice. 

Appendix  C  addresses  the  complexity  of  the  optimization  methods  advocated 
in  this  dissertation.  In  particular,  it  is  shown  that  the  number  of  function  evaluations 
required  by  the  lattice  scheduling  algorithms  is  of  linear  order  with  respect  to  problem 
size  for  nearly  all  cases,  making  it  superior  to  other  optimization  approaches  beyond 
some  problem  size.  The  results  of  the  fixed-lattice  algorithm  are  compared  to  those 
of  other  methods  and  indicate  the  algorithm  performs  favorably  at  small  problem 
sizes  as  well. 

In  an  effort  to  gain  more  understanding  of  the  nature  of  the  problem,  Appendix 
D  examines  the  dependence  of  the  optimal  cost  and  schedule  on  a  variety  of  factors, 
including  service  moments,  show  rates,  and  unit  costs  of  waiting  and  overtime. 

Appendix E describes a study performed for a medical clinic. Its
appointment system was analyzed, and the potential improvement attainable from
schedule/sequence optimization was determined to be 67%. Although preliminary, this
study is strong evidence that the approaches advocated here can be of immediate
practical value.

Appendix  F  describes  and  analyzes  approaches  to  finding  Coxian  distributions 
with  the  same  moments  as  those  of  some  empirical  distribution.  The  tools  developed 
include:  a  recursive  approach  to  determining  the  moments  of  a  Coxian  distribution; 
completion  of  earlier  researchers’  work  in  determining  the  bounds  of  the  Coxian-2 
distribution  in  moment  space;  and  the  determination  of  a  parsimonious  set  of  Coxian 
parameters  to  match  the  first  three  moments  of  a  given  distribution.  These  tools  were 
helpful  to  this  research,  since  Coxian  service  distributions  are  assumed  throughout. 
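
A minimal sketch of one way such a moment recursion can be written is given below, using the phase rates $\mu_j$ and completion probabilities $b_j$ defined in Table 1. It treats the remaining service from phase j as an exponential phase plus, with probability $1 - b_j$, the remaining service from phase j + 1, and expands the kth moment with the binomial theorem. This is offered only as an illustration under those assumptions and is not necessarily the recursion derived in Appendix F; the function name `coxian_moments` is hypothetical.

```python
from math import comb, factorial


def coxian_moments(mu, b, k_max):
    """Noncentral moments E[X], ..., E[X^k_max] of a Coxian distribution.

    mu[j]  service rate of phase j
    b[j]   probability that service completes immediately after phase j
           (the last entry should be 1)
    """
    # nxt[i] holds the i-th moment of the service remaining after the
    # current phase; beyond the last phase nothing remains.
    nxt = [1.0] + [0.0] * k_max
    for j in reversed(range(len(mu))):
        cur = [1.0]
        for k in range(1, k_max + 1):
            # i = 0 term: the k-th moment of an exponential phase, k!/mu^k
            total = factorial(k) / mu[j] ** k
            for i in range(1, k + 1):
                exp_moment = factorial(k - i) / mu[j] ** (k - i)
                # service continues to the next phase with probability 1 - b[j]
                total += comb(k, i) * exp_moment * (1.0 - b[j]) * nxt[i]
            cur.append(total)
        nxt = cur
    return nxt[1:]
```

As a check, an Erlang-2 distribution with unit rate corresponds to `mu = [1, 1]` and `b = [0, 1]`, for which the first three moments are 2, 6, and 24.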

Appendix G provides an analysis of several approaches to matrix
exponentiation. Matrix exponentiation is needed in this dissertation to find the cost of
a given schedule. Commonly-used algorithms such as those of Cayley and
Hamilton, Jordan, and Parlett are shown to have numerical instabilities in this problem
that preclude their use. No such problems are found with Padé or Maclaurin series
methods when coupled with a scale-and-square algorithm.
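
The scale-and-square idea can be sketched in a few lines of Python. The example below divides the matrix by a power of two until its norm is small, sums a truncated Maclaurin series for the scaled matrix, and then squares the result repeatedly. It is an illustration only and not the implementation evaluated in Appendix G; the function names, the number of series terms, and the scaling threshold of 0.5 are assumptions chosen for this example.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_scale_square(a, terms=18):
    """Approximate exp(a) by a truncated Maclaurin series with scaling and squaring."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)  # infinity norm
    s = 0
    while norm / 2 ** s > 0.5:     # ASSUMED threshold for the scaled norm
        s += 1
    scaled = [[v / 2 ** s for v in row] for row in a]

    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term becomes scaled^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]

    for _ in range(s):             # undo the scaling: exp(a) = exp(a / 2^s)^(2^s)
        result = mat_mul(result, result)
    return result
```

Scaling keeps the series terms small, so the truncated sum avoids the cancellation that a direct series for a matrix of large norm would suffer.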

II. Related Work


The  following  sources  are  divided  into  somewhat  artificial  categories.  Articles 
in the heuristic appointments section emphasize attacking the problem from a
practical standpoint by applying a heuristic and are most often formulated in terms of
medical patient scheduling and sequencing. On the other hand, articles in the
theoretical appointments section tend to start from a theoretical framework. However,
the  most  important  distinction  is  that,  with  few  exceptions,  researchers  classified  in 
one  category  appear  to  have  little  influence  on  the  efforts  of  researchers  in  the  other 
category.  Figure  2  shows  the  ideological  heritage  for  a  representative  selection  of  the 
64  articles  in  the  two  sections,  as  determined  by  the  articles  they  cite. 

Articles  in  the  control  of  queues  section  emphasize  formulation  of  the  problem 
in  terms  of  control  of  arrivals  and  do  not  rely  on  any  of  the  research  in  the  other 
sections, although some of the researchers cited in the theoretical appointments
section mention control of queues in passing. All three of these sections concentrate
on single servers and identical customers, so most only address the scheduling
problem. The fourth section addresses the sequencing problem. While the dividing lines
between  the  four  research  areas  are  not  always  defined  clearly,  the  distinctions  are 
useful  here. 

Last,  the  literature  related  to  optimization  of  submodular  functions  is  reviewed, 
since  the  cost  function  discussed  here  will  be  shown  to  be  submodular  with  respect 
to  the  arrival  time  vector,  when  the  sequence  of  arrivals  is  fixed. 

The  first  published  considerations  of  the  problem  of  scheduling  arrivals  found  in 
this  literature  search  were  in  1951.  In  the  discussion  following  Kendall’s  presentation 
on  queueing  theory  in  general,  Herne  briefly  mentioned  the  scheduling  of  iron  ore 
ships  into  English  ports  as  a  long-standing  problem  [80].  There  was  no  discussion 
on  how  to  formulate  or  solve  the  problem.  In  the  same  year,  concerns  regarding 
medical appointment scheduling (excessive waiting times, lateness of patients and
staff, and the need to collect and analyze scheduling data) were addressed in the
medical literature by Dale [30].

Figure 2. Ideological genealogy for a representative subset of research on
scheduling of arrivals. Connections are drawn if one article refers to the other.
Articles are referenced by the first letters of the principal author’s last
name and the year.


2.1  Heuristic Appointment Scheduling Literature.

Bailey  suggested  in  1952  that,  in  many  medical  clinics,  the  ratio  of  waiting 
time to service time was too high and could be addressed by application of
queueing principles. In particular, he argued against the common practice of blocking,
i.e., scheduling a set number of patients at the beginning of a set of regularly spaced
intervals. While this practice, especially the then-common use of a single block for
the entire day, minimizes the doctor’s idle time, it also maximizes the patients’
waiting time.¹ He recommended an individual scheduling scheme, in which patients are
scheduled individually at regular intervals. On the basis of earlier work, he fit
truncated Pearson Type III curves to the service distributions of 50 practices. Using a
Monte  Carlo  approach,  he  obtained  waiting  time  distributions  for  a  small  number  of 
patients.  No  recourse  was  made  to  steady-state  approximations  of  expected  waiting 
time.  He  discussed  the  trade-offs  between  reducing  patients’  waiting  time  and  the 
doctor’s  idle  time  and  suggested  a  reasonable  target  ratio  between  the  two  could  be 
achieved  with  only  a  negligible  increase  in  doctors’  idle  times  [7,  8,  120]. 

Despite  Bailey’s  emphasis  on  individual  scheduling,  he  recommended  partial 
blocking;  an  optimal  schedule  should  have  several  patients  arriving  before  the  start 
of  service.  Given  his  assumption  of  punctual  arrivals,  this  strategy  would  only  serve 
to  increase  patient  waiting  time  over  that  achieved  for  individual  scheduling,  without 
decreasing  doctor  idle  time.  He  may  have  tacitly  relaxed  the  punctuality  assumption 
and  was  allowing  for  the  possibility  of  late  arrivals,  in  which  case  such  a  policy  is 
sensible  [7,  8]. 

Welch  echoed  many  of  Bailey’s  suggestions  but  emphasized  the  importance  of 
punctuality  in  reducing  queue  size,  both  for  patients  and  for  doctors.  He  approached 
the  subject  with  less  rigor,  proposing  a  mixed  block-individual  appointment  scheme. 
In  this  scheme,  interarrival  times  were  all  set  equal  to  the  average  service  time,  and 
two  patients  arrived  at  the  start  of  the  day.  There  are  several  problems  with  this 
proposal.  Equal  interarrival  times  do  not  yield  optimality  in  the  transient  case. 
Interarrival  times  should  always  be  greater  than  the  mean  service  time  to  prevent 
waiting time building over the day. Unless customers are likely to be very late,
placing a second customer at the beginning of the day increases waiting time with
no concomitant decrease in idle time. However, Welch believed these simplistic rules
led to a reasonable, albeit suboptimal, solution [166]. Several articles discuss actual
applications of this scheduling scheme [45, 69, 167].

¹ The fact that appointment systems are a rather recent invention is evident from a comment
in the 1955 Nuffield study, citing a 1932 British study: “They believed that any comprehensive
system of appointment, giving each outpatient a separate time, was ‘an impossible ideal’; but they
thought nevertheless that in some of the special departments an appointment system for individual
patients or groups of patients might be possible [120].” A doctor’s 1964 comment that, “One of the
suggested improvements in general practice has been the introduction of an appointment system
for the patients” implies that individual appointment systems were by no means ubiquitous even
then [45].

Soriano  extended  Welch’s  work,  comparing  costs  for  block,  individual,  and 
mixed schedules. By appropriate choices of (constant) interarrival time and the
initial block size in a mixed block-individual system, a reduction in waiting time of
50% was achieved in a hospital outpatient department. He also obtained steady-state
waiting time distributions for various load factors in M/G/1 and D/$E_r$/1
systems [149, 150].

Computer  simulations  were  the  most  common  approach  to  the  problem  from 
1964  to  1981.  White  and  Pike  concentrated  on  systems  in  which  arrival  times  are 
not  deterministic  and  services  are  identically  and  independently  distributed  (iid), 
not  necessarily  exponentially.  They  proposed  a  block  system  in  which  a  day  is 
divided  into  approximately  10  blocks  and  a  fixed  number  of  patients  are  scheduled 
at  the  beginning  of  each  block.  Average  interarrival  times  were  still  set  equal  to 
the  mean  service  time  [168].  Fetter  and  Thompson  studied  effects  of  patient  load, 
lateness  of  both  patient  and  physician,  and  variations  in  interarrival  times  [38]. 
Katz  developed  a  Monte  Carlo  model  of  hospital  outpatient  scheduling  that  took 
into  account  lateness,  multiple  physicians  and  their  schedules,  and  lab  scheduling. 
A  candidate  schedule  could  then  be  input  to  determine  areas  with  excessive  waiting 
times  [79].  Several  simulations  were  applied  to  the  problem  subsequently  [9,  36,  51, 
55,  59,  39,  59,  85,  86,  139]. 

Fries  and  Marathe  considered  a  system  in  which  arrival  times  are  deterministic, 
services  are  iid  exponential,  and  cost  is  a  function  of  total  patient  waiting  time,  server 
idle  time,  and  server  overtime.  They  proposed  a  variable  sized,  multiple  block  system, 
in  which  the  number  of  schedule  blocks  is  predetermined,  but  their  duration  is  the 
product  of  the  mean  service  time  and  the  number  of  patients  scheduled  to  arrive  at 
the beginning of that block. They proved that the cost is convex with respect to
the arrival time vector, which consists of the number of patients scheduled at the
beginning of each slot. This is the first known proof of convexity of the cost function
for  a  type  of  finite-customer  queue.  The  optimum  schedule  was  then  determined 
using  dynamic  programming  [44].  Although  arrival  times  were  restricted,  this  is  also 
the  first  theoretical  consideration  of  a  system  in  which  the  interarrival  times  were 
not  necessarily  equal.  For  this  reason,  the  research  should  be  classified  with  the 
theoretical  appointment  scheduling  literature.  However,  it  is  discussed  here  because 
none  of  the  researchers  discussed  in  the  next  section  cited  it,  while  subsequent  authors 
cited  in  this  section  were  aware  of  it  and  used  it. 

Charnetski  considered  the  problem  of  individually  scheduling  a  fixed  number  of 
surgeons  into  a  hospital  operating  suite  during  a  fixed  period.  He  approximated  the 
cost  of  the  surgeons’  total  waiting  time  and  the  cost  of  idle  time  of  the  suite  (which 
includes  both  the  facility  and  the  dedicated  operating  room  personnel)  using  a  Monte 
Carlo  simulation.  Arrival  times  were  deterministic,  and  service  distributions  were 
distinct  truncated  Gaussians.  He  found  the  optimal  interarrival  time  which  would 
balance  the  approximated  costs  of  waiting  and  idling  [21]. 

Weiss independently considered a nearly identical problem, solving it
completely for two customers in the case of independently and distinctly distributed
(idd) general service times and dynamic scheduling. He offered a heuristic for the
case  of  scheduling  more  than  two  customers  that  was  tantamount  to  one  iteration  of 
a  cyclic  coordinate  search.  His  is  the  first  published  attempt  to  solve  the  problem 
of  sequencing  distinct  surgical  procedures  [164].  The  general  literature  regarding 
scheduling  policies  for  operating  rooms  is  extensive,  but  much  is  not  pertinent  to  the 
problem  at  hand,  since  the  objective  is  usually  to  balance  waiting  times,  rather  than 
to  minimize  them.  The  interested  reader  is  referred  to  reviews  in  [132]  and  [102]. 

Ho  and  Lau  compared  a  number  of  block  scheduling  schemes  for  the  case  of 
deterministic  arrival  times,  but  with  possible  no-shows,  and  either  iid  uniform  or  iid 
exponential service distributions. They cited previous, unpublished research showing
that distributions that differed only by third and higher moments yielded essentially
identical costs. They optimized the cost with respect to block scheduling parameters,
as well as the number of patients served. A number of scheduling strategies were
considered, and a set of eight were identified as optimal under different parameters.
Remarkably, one of these was Bailey’s and Welch’s scheme of 40 years earlier, in
which two customers arrive before the service even begins, and the remaining
interarrival times are set equal to the mean service time [63, 64, 65]. This work paralleled,
but  was  independent  of,  simulation  work  performed  much  earlier  by  researchers  at 
the  Air  Force  Institute  of  Technology  [12,  51,  59]. 
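
The Bailey-Welch rule mentioned above can be compared with a purely individual schedule by a small Monte Carlo experiment. The sketch below is only illustrative and is not taken from any of the cited studies: it assumes punctual arrivals, no no-shows, and iid exponential services, and the function names and default parameters are hypothetical.

```python
import random

def simulate(arrivals, services):
    """Total waiting time and finishing time for punctual customers served
    first-come-first-served by a single server that opens at time zero."""
    free_at = 0.0       # time at which the server next becomes free
    total_wait = 0.0
    for t, s in zip(arrivals, services):
        start = max(t, free_at)   # begin on arrival or when the server frees up
        total_wait += start - t
        free_at = start + s
    return total_wait, free_at

def compare_rules(n=10, mean_service=1.0, runs=2000, seed=1):
    """Average (total waiting, finishing time) under each rule.
    Both rules see the same service draws (common random numbers)."""
    rng = random.Random(seed)
    schedules = {
        # One customer per mean service time, starting at time zero.
        "individual": [j * mean_service for j in range(n)],
        # Bailey-Welch: two customers at time zero, then one per mean service time.
        "bailey_welch": [0.0] + [j * mean_service for j in range(n - 1)],
    }
    sums = {name: [0.0, 0.0] for name in schedules}
    for _ in range(runs):
        services = [rng.expovariate(1.0 / mean_service) for _ in range(n)]
        for name, arrivals in schedules.items():
            wait, finish = simulate(arrivals, services)
            sums[name][0] += wait
            sums[name][1] += finish
    return {name: (w / runs, f / runs) for name, (w, f) in sums.items()}
```

Calling `compare_rules()` returns the two averages for each rule, so the trade-off between customer waiting and the server's finishing time can be inspected for any assumed cost weights.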

2.2  Theoretical  Appointment  Scheduling  Literature. 

More  mathematical  approaches  to  schedule  optimization  began  with  a  brief 
discussion  in  1956  in  Operations  Research  of  the  problem  of  minimizing  a  combination 
of  waiting  and  idle  times  for  a  steady-state  system  [159].  A  similar  problem  was 
formulated  and  solved  by  Morse  in  relation  to  the  scheduling  of  ships  to  a  docking 
facility  for  the  case  of  a  steady-state  M/M/1  system  [111]. 

Jansson  was  first  to  obtain  the  optimal  interarrival  time  of  a  steady-state 
D/M/1  queue,  in  1966  [70].  In  1968,  Grape  extended  Jansson’s  work  to  explore 
the rate of convergence of transient D/M/1 queues to steady-state. As might be
expected, convergence is dependent on the queue size at the start of customer service
and  on  the  utilization  (the  ratio  of  expected  service  time  to  interarrival  time).  This 
is  the  first  known  consideration  of  scheduling  arrivals  to  a  transient  queue  [56]. 
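
For orientation, the steady-state D/M/1 quantities involved can be computed in a few lines. The sketch below uses the standard GI/M/1 result that the mean wait in queue is σ/(μ(1 − σ)), where σ is the root in (0, 1) of σ = exp(−μa(1 − σ)) for constant interarrival time a and service rate μ. The cost function, the grid search, and all names are assumptions made for illustration; they are not Jansson's formulation.

```python
import math

def dm1_sigma(interarrival, mu, tol=1e-12, max_iter=100_000):
    """Root in (0, 1) of sigma = exp(-mu * a * (1 - sigma)), by fixed-point
    iteration started from zero."""
    if mu * interarrival <= 1.0:
        raise ValueError("utilization must be below 1 for a steady state")
    sigma = 0.0
    for _ in range(max_iter):
        nxt = math.exp(-mu * interarrival * (1.0 - sigma))
        if abs(nxt - sigma) < tol:
            return nxt
        sigma = nxt
    return sigma

def dm1_mean_wait(interarrival, mu):
    """Steady-state mean waiting time in queue for D/M/1."""
    sigma = dm1_sigma(interarrival, mu)
    return sigma / (mu * (1.0 - sigma))

def cost_per_customer(interarrival, mu, c_wait, c_idle):
    """Assumed cost: weighted mean wait plus weighted server idle time
    per customer (long-run idle time per arrival is a - 1/mu)."""
    idle = interarrival - 1.0 / mu
    return c_wait * dm1_mean_wait(interarrival, mu) + c_idle * idle

def best_interarrival(mu, c_wait, c_idle, lo=1.05, hi=5.0, steps=200):
    """Grid search over interarrival times between lo/mu and hi/mu."""
    grid = [(lo + k * (hi - lo) / steps) / mu for k in range(steps + 1)]
    return min(grid, key=lambda a: cost_per_customer(a, mu, c_wait, c_idle))
```

The utilization in this notation is 1/(μa), so the lower grid bound `lo` must stay above 1 for a steady state to exist.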

Fries and Marathe investigated the scheduling of arrivals to an S(N)/M/1 queue
under transient conditions, as discussed above [44]. The S(N) notation denotes
deterministic  arrival  times  with  distinct  interarrival  times.  They  optimized  variable- 
length  block  schedules  in  which  block  size  was  a  fixed  multiple  of  the  number  of 
customers  in  the  block.  Other  researchers  discussed  in  this  section  do  not  cite  this 
work or the work of Grape.

Pegden and Rosenshine addressed the optimization of variable-length individual
schedules [125, 127]. In this formulation, arrivals are scheduled in continuous
time, services are iid exponential, cost is a linear convex function of expected
waiting times and expected server availability (the overtime as measured from zero), and
the scheduling horizon is not fixed. Their cost function is a special case of the cost
function proposed in this dissertation. For this S(N)/M/1 problem, they were able
to prove convexity for the cases of N = 2 and N = 3. They provided the exact
solution of the optimization problem for N = 2 and solved it numerically when N = 3.
They formulated the cost function for larger values of N and chose a Hooke-Jeeves
optimization because of the difficulty in obtaining derivative information. However,
the convexity of the cost function in all cases was not resolved. Healy, Pegden, and
Rosenshine later extended their work to consider two parallel servers (S(N)/M/2)
for a small number of customers [60, 61].

Difficulties in the continuous-time formulation led Pegden and Rosenshine to
restrict arrival times to fixed lattice points. As mentioned above, this formulation is
more  representative  of  most  actual  problems.  They  used  dynamic  programming  to 
solve  the  dynamic  version  of  the  problem,  in  which  the  scheduling  decisions  for  future 
arrivals  are  revised  at  each  time  interval.  They  suggested  approaches  for  solution  of 
the  static  problem  as  well.  Convexity  of  the  cost  function  for  N  >  3  was  conjectured 
but  not  established  [126]. 

Liao extended these results to the iid S(N)/$E_r$/1 case of lattice arrival times
and  a  finite  schedule  horizon.  He  obtained  the  optimal  dynamic  schedules  by  a 
recursive  scheme  and  then  used  these  solutions  as  lower  bounds  to  solve  the  static 
case  by  a  branch-and-bound  algorithm.  The  cost  function  was  also  formulated  for 
multiple  class  and  multiple  server  problems.  While  he  proved  convexity  of  the  cost 
function  for  the  dynamic  version  of  the  problem,  convexity  for  the  static  version  for 
N  >  3  was  still  unresolved.  Liao  noted  the  applicability  of  his  work  to  just-in-time 
inventory systems [95, 96, 97].

Simeoni proposed a different search procedure for the problem Liao proposed,
implementing  it  for  the  case  of  iid  Erlang  services.  This  procedure  modified  the 
schedule  by  only  one  customer  arrival  at  a  time  but  in  a  way  that  fathomed  large 
amounts  of  the  solution  space.  The  issue  of  convexity  was  still  unresolved  [145]. 
Vanden  Bosch  and  Dietz  showed  that,  while  Simeoni’s  proof  was  incomplete,  the 
method  was  sound  and  relied  on  the  submodularity,  rather  than  the  convexity,  of 
the  cost  function.  They  extended  the  algorithm  to  the  optimal  scheduling  of  arrivals 
with  iid  Erlang  service  distributions,  and  proved  it  was  applicable  to  general  service 
distributions  if  the  cost  could  be  evaluated  [158].  Further  generalizations  of  the 
algorithm  are  possible  and  will  be  discussed  in  Chapter  IV. 

Wang  was  the  first  to  prove  stochastic  convexity  of  a  cost  function  for  the 
general  S(N)/G/1  problem.  His  cost  was  a  convex  function  of  waiting  times  and 
server  availability  (the  sum  of  the  scheduling  horizon  and  overtime),  the  arrival 
time  space  was  continuous,  and  the  scheduling  horizon  was  not  fixed  or  bounded. 
He  addressed  both  the  static  and  a  dynamic  scheduling  problem  for  iid  phase  (PH) 
service  distributions.  He  obtained  the  optimal  solution  by  applying  a  gradient  search 
to  a  series  of  differential  equations  in  matrix  form  [160].  Subsequently,  he  addressed 
the  applicability  of  the  problem  to  just-in-time  systems  in  which  steady-state  is 
not reached due to work stoppages [161]. Two works not yet published propose
efficiencies in his approach and compare optimal transient schedules to the
steady-state results of Jansson [57, 163].

Wang  noted  the  substantial  simplifications  to  obtaining  expected  waiting  time 
that  accrue  using  phase-type  distributions.  However,  most  of  his  results  depend  on 
placing  the  phase  transition  matrix  in  Jordan  canonical  form  [160].  As  will  be  shown 
in  Appendix  G,  this  can  cause  severe  floating  point  errors  in  certain  situations  that 
are  frequently  encountered. 

Several  authors  subsequently  investigated  the  scheduling  of  arrivals  at  equal 
intervals, assuming that customers are allowed to arrive late according to some
distribution. Several authors approached the issue of lateness through simulation
[38, 55, 79, 168]. Winsten first considered an analytical approach, in which he
modified a D/M/1 queue to allow for lateness. The lateness distributions were iid and were
general, with the restriction that customers were not allowed to change order. He
determined steady-state measures of this system [171]. Mercer extended these results
to multiple servers, bulk arrivals, and a more general staged service distribution.
While he did not allow the jth customer to arrive before the (j - 1)st, he did allow
it to arrive at any time later. Again, steady-state measures were sought [104, 105].
Sabria  and  Daganzo  examined  a  transient  queue  with  scheduled  arrivals  (i.e.,  an 
appointment  system)  in  which  customers  are  forced  to  balk  if  they  do  not  arrive 
in  the  order  scheduled.  Lateness  distributions  and  service  distributions  are  iid  and 
general.  They  obtained  approximations  to  queue  length  and  waiting  time  that,  for 
the  steady-state  case,  were  in  error  by  less  than  10%  [137]. 

Several  researchers  have  considered  similar  deterministic  scheduling  problems. 
None directly apply to the deterministic analogue to the problem of scheduling
arrivals to an appointment system, which is addressed in Appendix A. They are not
reviewed  here,  but  the  interested  reader  is  directed  to  several  recent  articles  with 
good  surveys  [10,  11,  81,  88,  144]. 

2.3  Control  of  Queues  Literature. 

A  number  of  the  above  researchers  have  suggested  the  scheduling  of  arrivals 
problem is connected to the control of arrivals to a queue. In these problems,
decisions are made at various times whether to accept or reject entry of an arrival
to  the  system.  While  none  of  the  literature  in  this  area  was  deemed  applicable  to 
the  current  problem,  a  brief  examination  of  the  literature  is  appropriate,  if  only  to 
provide  the  reader  with  an  understanding  of  why  this  approach  was  rejected. 

Heyman  was  the  first  to  consider  a  problem  in  the  control  of  arrivals  to  a 
queue. He developed optimal policies for an M/G/1 queue whose output is controlled
by turning the server off and on [62]. Naor addressed the control of arrivals to an
M/M/1 queue by imposing an entrance fee on arrivals and allowing each customer
to  decide  whether  to  enter  the  queue.  He  obtained  optimal  strategies  both  for  the 
case  in  which  each  customer  exercises  its  own  narrow  self-interest  and  for  the  case 
in  which  each  customer  considers  the  expected  cost  to  the  customers  as  a  group. 
Although  these  strategies  both  are  of  the  form,  “enter  the  queue  if  the  cost  is  less 
than  some  fixed  amount”,  they  are  not  equivalent  [113]. 

Johansen  and  Stidham  considered  a  problem  close  to  the  one  of  interest.  They 
controlled  a  transient  GI/G/1  queue  by  accepting  or  rejecting  arriving  customers. 
A  reward  was  accrued  for  every  accepted  customer,  but  a  cost  based  on  the  waiting 
time  was  also  incurred.  Problems  were  considered  in  which  the  customers  exercised 
self-interest  and  in  which  the  global  interest  was  optimized.  One  of  the  difficulties 
they  encountered  was  again  the  question  of  convexity  of  the  cost  function  [71]. 

Several  sources  provide  good  reviews  of  the  subsequent  research  on  control  of 
arrivals  to  a  queue  [27,  33,  82,  148,  153].  These  efforts  may  be  classified  by  several 
characteristics. 

-  Control discipline. Control may be open-loop, in which case the current
decision must be made in ignorance of past and current states of the system,
or  closed-loop.  These  optimal  control  strategies  are  equivalent  to  the  optimal 
static  and  dynamic  strategies,  respectively. 

-  Queueing system. Researchers have addressed multiple servers, balking,
reneging, multiple customer classes, and networks; in short, the whole spectrum of
queueing  systems. 

-  Objective  function.  The  objective  may  be  to  minimize  average  expected  cost, 
cost  variance,  maximum  expected  cost,  average  expected  queue  length,  or  some 
other possibility. Costs may be customer-oriented or system-oriented. As
mentioned above, it must be specified whether the basis for customer decisions
is self-interest or global interest.

-  Equilibrium.  The  control  of  arrivals  literature  addresses  only  steady-state 
cases,  with  few  exceptions  [34,  71,  82]. 

-  Control  space.  Control  of  arrivals  generally  has  been  pursued  by  modifying 
service  rates,  by  modifying  cost  functions,  or  by  routing  arrivals  to  different 
servers. 

Researchers  are  deterred  when  trying  to  apply  these  efforts  in  control  of  arrivals 
to  the  problem  of  scheduling  arrival  times  to  an  appointment  system  for  several 
reasons.  First,  few  of  the  efforts  in  control  of  queues  address  transient  queueing 
problems.  Second,  control  in  this  problem  is  achieved  by  modifying  the  arrival 
times,  a  situation  the  control  literature  does  not  appear  to  address.  Last,  because 
the  parameters  in  the  cost  functional  are  not  time-dependent,  the  control  formulation 
does  not  provide  any  benefit;  it  degenerates  to  an  optimal  design  formulation,  and 
nonlinear  programming  (NLP)  techniques  are  directly  applicable  [27].  Thus,  no 
advantage  is  accrued  by  formulating  it  as  a  control  of  arrivals  problem.  As  a  result, 
the  problem  will  not  be  considered  further  in  terms  of  control  of  arrivals. 

2.4  Sequencing Literature.

In  the  case  of  identical  customers,  the  problem  can  be  approached  strictly 
from  a  queueing  theory  standpoint.  Stochastic  scheduling  theory  is  irrelevant,  since 
it  is  concerned  with  the  optimal  sequencing  of  customer  services.  However,  when 
the customers differ in cost coefficients or service distributions, or are otherwise
distinguishable, the sequence of the customers is relevant. Therefore, a brief summary of
the  pertinent  literature  in  stochastic  scheduling  is  appropriate. 

Sequencing  problems  may  be  classified  by  a  number  of  characteristics.  The 
general  form  of  the  problem  at  hand  would  be  defined  by  a  single  machine  with  N 
jobs to be processed, each with a release date (arrival time) to be determined. The
objective function in a scheduling problem typically is described in terms of release
dates, $r_j$; completion times, $D_j$; due dates, $d_j$; and tardiness,
$T_j = \max(0, D_j - d_j)$. The proposed objective function may be put in a stochastic
scheduling context by defining $d_j = r_{j+1} = t_{j+1}$, in which case
$W_j = T_{j-1}$, where $W_j$ is the expected waiting time of the jth customer.
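
The correspondence between tardiness and waiting time can be checked on a single realization. The Python sketch below is an illustration only: it uses fixed service times rather than expectations, treats the due date of the last job as a fixed horizon (an assumed reading of the fixed final release date), and uses hypothetical function names and data. Lists are indexed from zero, so `W[j + 1]` corresponds to `T[j]`.

```python
def fcfs_times(release, service):
    """Completion and waiting times for one machine serving jobs in index order."""
    free_at, completion, waiting = 0.0, [], []
    for r, s in zip(release, service):
        start = max(r, free_at)       # start on release or when the machine frees up
        waiting.append(start - r)
        free_at = start + s
        completion.append(free_at)
    return completion, waiting

def tardiness(release, completion, horizon):
    """T_j = max(0, D_j - d_j), where the due date d_j of each job is the
    release date of the next job and the last due date is the horizon."""
    due = list(release[1:]) + [horizon]
    return [max(0.0, done - d) for done, d in zip(completion, due)]

release = [0.0, 2.0, 3.0, 7.0]    # scheduled arrival times (hypothetical)
service = [3.0, 2.0, 1.0, 1.0]    # realized service times (hypothetical)
D, W = fcfs_times(release, service)
T = tardiness(release, D, horizon=7.5)
```

In this example each job's tardiness equals the next job's waiting time, and the tardiness of the last job is the amount by which service runs past the horizon.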

These characteristics define a $1\,|\,r_j\,|\,\sum c_j T_j$ system [130], but with three unique
characteristics.  First  and  most  importantly,  the  release  dates  are  dependent  on  the 
sequence.  That  is,  when  the  sequence  is  changed,  the  optimal  arrival  times  under 
that  sequence  are  generally  altered.  This  is  a  situation  not  addressed  in  either 
deterministic  or  stochastic  scheduling  literature,  with  the  exception  of  Wang’s  and 
Weiss’s articles, as discussed below [162, 164]. Second, the due dates are dependent
on  the  release  dates.  (This  is  not  true  of  the  deterministic  analogue  discussed  in 
Appendix  A,  in  which  all  due  dates  are  fixed  at  rh.)  Third,  in  the  cost  formulation 
discussed in Chapter III, the $(N+1)$st release date, $r_{N+1}$, is fixed. The goal is to
determine  both  the  release  dates  and  order  of  jobs  that  will  minimize  the  objective 
function,  under  these  special  conditions. 

If  customers  in  the  appointment  system  have  identical  service  distributions  and 
interarrival  times  are  fixed  and  equal,  so  that  customers  differ  only  in  cost  functions, 
expected  waiting  times  increase  with  the  customer  index.  The  optimal  static  strategy 
in  this  situation  clearly  is  to  sequence  the  job  release  dates  in  the  order  of  decreasing 
unit  costs.  This  is  a  special  case  of  a  strategy  commonly  called  weighted  shortest 
expected  processing  time  first  (WSEPT),  in  which  expected  processing  times  are 
weighted  by  the  (linear)  cost  coefficients.  The  problem  with  generalizing  this  result 
is  that,  for  the  sequencing  of  arrivals  to  an  appointment  system,  the  scheduled  arrival 
times  shift  as  the  sequence  is  altered,  making  it  difficult  to  compare  any  improvement 
in  optimal  cost  as  the  sequence  of  arrivals  changes. 
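
For reference, the WSEPT rule itself is a one-line sort. The sketch below assumes the simplest setting in which the rule is known to be optimal, namely jobs that are all available at time zero with a cost that is linear in completion times; it does not model the appointment problem, in which arrival times shift with the sequence. The tuple layout and function names are hypothetical.

```python
from itertools import permutations

def wsept_order(jobs):
    """Order jobs by decreasing unit cost per unit of expected processing time.
    Each job is a tuple (name, unit_cost, expected_processing_time)."""
    return sorted(jobs, key=lambda job: job[1] / job[2], reverse=True)

def expected_weighted_completion(sequence):
    """Expected sum of unit_cost * completion time when all jobs are
    available at time zero and are processed back to back."""
    clock, total = 0.0, 0.0
    for _name, unit_cost, mean_time in sequence:
        clock += mean_time
        total += unit_cost * clock
    return total

def brute_force_best(jobs):
    """Smallest expected cost over all sequences (small instances only)."""
    return min(expected_weighted_completion(p) for p in permutations(jobs))
```

When all expected processing times are equal, the sort key reduces to the unit cost, which gives the decreasing-unit-cost order described above.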

More  difficult  are  cases  for  which  neither  cost  functions  nor  service  distributions 
are identical. Cox and Smith considered an M/G/1 queue in which customer classes
had different arrival rates, service distributions, and cost coefficients in a linear cost
function. For the case in which preemption is forbidden, they proved the optimal
dynamic strategy is WSEPT [26]. Kakalik and Little extended the work of Cox and
Smith  to  allow  the  server  to  remain  idle  rather  than  serve  a  customer.  They  proved 
that  WSEPT  is  still  optimal  and  that  the  server  should  never  choose  to  remain 
idle  [78].  (This  strategy  of  avoiding  idle  time  is  not  optimal  in  the  case  of  multiple 
servers,  however  [14].)  This  dependence  of  the  optimal  sequence  only  on  the  first 
moment  of  the  service  distribution  is  not  likely  to  be  the  case  for  finite  queues. 

Another  starting  point  is  the  literature  regarding  the  sequencing  of  jobs  that 
have identical release dates, random services, and in which the objective is to
minimize the cost of tardiness. Rothkopf showed that in such cases, and for a large class
of service distributions, allowing preemption will not improve the optimal cost [135].
Sevcik addressed the case of dynamically sequencing jobs with cost a linear function
of tardinesses, identical release dates, idd general service distributions, and
preemption allowed, in the context of a computer job queue. He showed the non-optimality
of a WSEPT strategy and proved the optimality of a rule he called smallest rank
(SR), in which remaining service times are weighted inversely by both the cost
coefficient and by the probability the request will be processed within a sufficiently small
time interval, and then ordered accordingly [142].

Glazebrook  obtained  the  static  and  dynamic  policies  for  a  similar  problem  with 
stochastic  services,  identical  release  dates,  an  arbitrary  set  of  precedence  relations, 
and  a  cost  that  is  a  general  function  of  the  decision  taken  at  each  time.  In  rough 
terms, he proved that if the mean residual life of each service distribution is
nonincreasing, and if the unit cost of completing each individual job does not increase
over the processing of that job, then an optimal strategy that optimally orders the
customers at the start of the process does exist. This holds true whether
preemption is allowed or not. He proposed a modification to Glazebrook and Gittins’s
algorithm [50] to determine the optimal ordering [48].

The above approaches may have limited applicability, since the jobs have
identical release dates. Pinedo considered optimal order when cost is defined as a weighted
sum of job completion times, release dates are distinct, and preemption is allowed
($1\,|\,\mathrm{pmtn}, r_j\,|\,\sum c_j D_j$). He found that, if services are idd exponential, the optimal
static and dynamic policies are WSEPT [129].

As mentioned in the heuristic appointment scheduling section, Weiss independently
considered the static problem with release dates, distinct general services,
and a cost that is a function of waiting and idle times, in the context of scheduling
and sequencing surgeons to an operating room. He proved that if the service
distributions are exponential or Gaussian, the optimal sequence for two customers
has the customer with the lowest-variance service arrive first, but he was unable
to extend this result to other service distributions or to larger numbers of arrivals.
He simulated several sequencing examples to demonstrate the apparent efficacy of a
smallest-variance-first scheme. This is the first known attempt to solve both the
scheduling and sequencing problems [164].

Wang recently extended his earlier scheduling efforts to include the sequencing
of multiple classes of customers. Customer services are assumed to be exponential
but may have different means, and cost is a linear function of total waiting time
and server availability (i.e., $c_1 = \cdots = c_N$ and $\tau_v = 0$). He hypothesized
that the optimal sequence orders arrivals by decreasing exponential rate and pointed
out that this policy orders arrivals by increasing service variance. While there is
empirical evidence for this conjecture when the overtime point is at zero, he was
unable to prove it [162].

2.5  Optimization  of  Submodular  Functions 

The cost function to be discussed is both convex and submodular with respect
to the arrival time vector, as long as the sequence of arrivals is fixed. A function
$f : \mathbb{R}^n \to \mathbb{R}$ is submodular on $\mathbb{R}^n$ if

\[ f(x \wedge y) + f(x \vee y) \le f(x) + f(y) \quad \forall x, y \]

where $\wedge$ and $\vee$ are the meet and join operations, defined (for this purpose) by

\[ x \wedge y = [\min(x_1, y_1), \min(x_2, y_2), \ldots] \]

\[ x \vee y = [\max(x_1, y_1), \max(x_2, y_2), \ldots] \]
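
As a minimal illustration of the definition above, the following Python sketch checks the submodular inequality by brute force over a small grid. The function names (`meet`, `join`, `is_submodular_on_grid`) and the two example functions are hypothetical, chosen only for illustration; they are not part of the methods developed in this dissertation.

```python
from itertools import product


def meet(x, y):
    """Componentwise minimum of two vectors (the lattice meet)."""
    return tuple(min(a, b) for a, b in zip(x, y))


def join(x, y):
    """Componentwise maximum of two vectors (the lattice join)."""
    return tuple(max(a, b) for a, b in zip(x, y))


def is_submodular_on_grid(f, grid, tol=1e-12):
    """Check f(meet) + f(join) <= f(x) + f(y) for every pair of grid points.

    `grid` is a list of coordinate lists, one per dimension. Passing this
    check on a finite grid is evidence of submodularity, not a proof.
    """
    points = list(product(*grid))
    return all(
        f(meet(x, y)) + f(join(x, y)) <= f(x) + f(y) + tol
        for x in points
        for y in points
    )


axis = [0.0, 0.5, 1.0, 1.5]
grid = [axis, axis]  # illustrative two-dimensional grid


def f_sub(v):
    """Convex function of a difference: submodular."""
    return abs(v[0] - v[1])


def f_super(v):
    """Product of coordinates: supermodular, so the check should fail."""
    return v[0] * v[1]
```

A function of the form $g(x_1 - x_2)$ with $g$ convex passes the check, while the product $x_1 x_2$ fails it at, for example, $x = (0, 1)$ and $y = (1, 0)$.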

Submodular functions were first explored by Lorentz [101]. Fan named them subadditive
[37], while Marshall and Olkin coined the term L-subadditive to avoid confusion
with functions for which $f(x + y) \le f(x) + f(y)$ [103]. However, submodular is the
term in common use currently [157].

The problem of ordering the permutations of a vector relative to some submodular
function has been addressed by a number of authors, and a recent survey can
be found in Chang and Yao [20]. However, the cost function used in this dissertation
is not submodular when the sequence of arrivals is altered, so these efforts are not
relevant to the sequencing problem.

The  problem  of  maximizing  a  submodular  function  has  extensive  application, 
and  surveys  can  be  found  in  two  recent  articles  [47,  94] .  Although  the  work  is  equally 
applicable  to  minimization  of  a  supermodular  function,  it  is  not  of  help  in  minimizing 
a  submodular  function,  so  it  is  not  reviewed  here. 

Topkis proved that the set of points over which a submodular function attains
its minimum is a sublattice. ($M$ is a sublattice of $L$ if $M \subset L$ and $x, y \in M$
implies $x \wedge y \in M$ and $x \vee y \in M$.) Using this result, he proposed a general
approach to minimization [155, 156]. He pointed out a number of problems that
hinge on modification of a submodular function, including the min-cut, max-flow
problem in graph theory; an optimal pricing strategy problem; and optimal control
of an unreliable system. The approach advocated in this dissertation is in a sense a
modification of his approach.

Goemans and Ramakrishnan recently addressed the problem of finding the
minimum cut in a graph and also framed it in terms of minimizing a submodular
function [52]. This effort was apparently independent of Topkis's.

2.6  Summary

The  major  goal  of  this  research  is  to  determine  the  optimal  sequence  and 
schedule  of  customer  arrivals  efficiently,  given  various  circumstances.  The  approach 
to  the  scheduling  problem  will  be  analytical,  while  the  sequencing  problem  will  be 
treated  heuristically.  A  summary  of  the  relevance  of  the  above  literature  to  these 
goals  follows. 

The  research  in  the  area  of  control  of  queues  appears  peripheral  to  this  effort, 
although  perhaps  another  researcher  might  formulate  the  problem  differently  and 
find  an  application  of  these  approaches  to  the  problem.  The  heuristic  appointment 
literature  holds  some  promise  in  guiding  this  research  toward  realistic  problems, 
although  the  modeling  and  empirical  approaches  used  in  their  schedule  optimizations 
are  not  applicable  here. 

This dissertation would be classified as part of the theoretical appointment
literature, and it is therefore not surprising that most of the efforts in that section are
relevant here. Liao's approach to lattice schedule optimization [95, 96, 97] was
successful and must be considered as an alternative to the approach that will be proposed
here. The various NLP approaches to schedule optimization must also be considered
[60, 61, 126, 160]. The approach to cost evaluation that will be proposed here is
similar to Wang's embedded continuous Markov chain approach [57, 160, 163]. The
schedule optimization approaches that will be advocated for various circumstances
are refinements of Simeoni's basic idea [145, 158], which bear a close relationship
to Topkis's framework [155, 156]. The other theoretical appointment literature
discussed is also relevant, albeit less directly so.

Very few efforts have addressed appointment sequencing. Wang's unpublished
attempt at sequence optimization for exponential services, albeit imaginative, was
unsuccessful and quite limited in scope [162]. Likewise, Weiss obtained the optimal
sequence only for systems with two customers, which limits the usefulness of his
result [164]. The methodologies used in these efforts hold little promise for more
general problems. Based on results in Appendix A for optimally sequencing deterministic
appointment systems, analytical approaches will be rejected in favor of a good heuristic.
No previous research has addressed the sequencing of customers with deterministic
services to an appointment system.

III.  Objective Function Formulation

In many optimization schemes, evaluation of the objective function consumes
the majority of the computation involved. It is therefore important to examine
methods of efficiently evaluating the cost function. This chapter examines cost
evaluation under a variety of circumstances.

Recall that $\tau_i$ is the scheduled arrival time of customer $i$ and is controllable.
The waiting time of the $i$th customer for a particular realization is often expressed
in this chapter as $W_i(\tau)$ or $W_i(x)$ to emphasize the dependence of the waiting time
on the vector of arrival times or service times. The unit cost of the $i$th customer's
waiting time is $c_i$. Customers are indexed in order of their arrivals.

The cost is considered to be a convex combination of expected waiting times
for the $N$ customers and the expected server overtime, where server overtime is the
time past $\tau_v$ (the overtime point) that the server must continue to serve customers. The
coefficients in this convex combination are the $c_i$. The usual requirement for convex
combinations that $\sum_i c_i = 1.0$ will sometimes be useful, but will usually be ignored,
since the problem can always be transformed to suit this requirement by scaling the
unit costs, as long as they are nonnegative and at least one is positive. This condition
will always be assumed. The unit costs are also assumed to be constant.

For the purposes of calculation and notation, it is convenient to add a fictitious
$(N+1)$st customer to the schedule at $\tau_v$. This new customer is not permitted service
until the $N$th customer completes its service. Thus, the expected server overtime is
$E[W_{N+1}(\tau)]$.

Given these definitions, the total cost associated with a particular arrival time
vector is

\[ C(\tau) = \sum_{i=2}^{N+1} c_i\, E[W_i(\tau)] \tag{1} \]

This cost is thus a function of the arrival time vector $\tau$, the overtime point
$\tau_v$, and the expected waiting time vector $E[W(\tau)]$, which in turn is a function of $\tau$,
the service probability density functions (PDFs) $f_1, f_2, \ldots, f_N$, and the probabilities
of customers arriving for appointments, $\gamma_1, \gamma_2, \ldots, \gamma_N$. The first arrival time is fixed
at 0, and the constraint $\tau_i \ge 0$ applies for all $i$. The constraints $\tau_i \le \tau_h \; \forall i$ apply if
arrival times are bounded, where $\tau_h$ is called the schedule horizon. The constraints
$\tau_1 \le \tau_2 \le \cdots \le \tau_N$ apply if the sequence of arrivals is to remain fixed. The task at
hand is to efficiently determine or approximate the expected waiting time for each
customer, given general service distributions and arrivals that are punctual unless
the customer fails to show.

Two approaches are examined here. The first is to obtain an exact expression
for expected waiting times for general service distributions. The expression for
the $j$th customer turns out to be a $j$-fold convolution integral, so it generally will
be necessary to approximate. The second approach is to approximate the service
distributions with phase-type distributions or with Erlang distributions and exploit
the memoryless property of the phases. Also, the simplifications that accrue from
restricting arrival times to lattice points are examined. Last, a brief look at some
properties of the cost function provides the reader with a better understanding of
the optimization problems treated in subsequent chapters.

3.1  General  Service  Distribution 

Assume for the moment that no-shows are forbidden (i.e., $\gamma_k = 1 \; \forall k$). The
expected waiting times can be determined by a convolution argument. For each
possible vector of service times $x$, it is well known [46] that

\[ W_j = \max\,[\,0,\; W_{j-1} + x_{j-1} - \tau_j + \tau_{j-1}\,] \tag{2} \]

Then for $z \ge 0$,

\begin{align*}
P(W_j \le z) &= P(W_{j-1} + X_{j-1} - \tau_j + \tau_{j-1} \le z) \\
&= P(W_{j-1} + X_{j-1} \le \varphi) \\
&= \int_0^{\varphi} P(W_{j-1} \le \varphi - x_{j-1}) \, dF_{j-1}(x_{j-1}) \tag{3}
\end{align*}

where $F_j(x)$ is the cumulative distribution function (CDF) associated with $f_j(x)$, and
$\varphi = \tau_j - \tau_{j-1} + z$. This expression is just the transient form of Lindley's waiting time
result [99]. From this equation, the waiting time PDFs can be obtained recursively.
To obtain the expected waiting times, recall that $E(W_j) = \int_0^\infty P(W_j > z)\, dz$. Then

\[ E(W_j) = \int_0^\infty \left[ 1 - \int_0^{\varphi} P(W_{j-1} \le \varphi - x_{j-1}) \, dF_{j-1}(x_{j-1}) \right] dz. \tag{4} \]

Now consider the more general case where customers may fail to show but are
otherwise punctual. Define a show vector, $\theta$, where $\theta_j = 1$ if customer $j$ showed
for the appointment in a particular instance and $\theta_j = 0$ otherwise. The plan is to
determine $E(W_j \mid \theta_j = 1)$ by conditioning on the value of $\theta_{j-1}$. With probability
$1 - \gamma_{j-1}$, the service time of customer $j-1$ is zero, and its waiting time is also zero.
However, just for the purposes of calculating $E(W_j \mid \theta_j = 1)$, one can picture that
customer $j-1$ always showed, waited for service, and then either completed service
immediately, with probability $1 - \gamma_{j-1}$, or else underwent service of length $x_{j-1}$, with
probability $\gamma_{j-1}$. Since $E(W_1) = 0$, for $j \ge 2$,

\begin{align*}
E(W_j \mid \theta_j = 1) ={}& (1 - \gamma_{j-1}) \int_0^\infty P(W_{j-1} > \varphi \mid \theta_{j-1} = 1)\, dz \\
&+ \gamma_{j-1} \int_0^\infty \int_0^{\varphi} \bigl(1 - P(W_{j-1} \le \varphi - x_{j-1} \mid \theta_{j-1} = 1)\bigr)\, dF_{j-1}(x_{j-1})\, dz \tag{5}
\end{align*}

Since $E(W_j \mid \theta_j = 0) = 0$, it is a simple matter to obtain customer $j$'s expected
waiting time by conditioning upon whether it arrives or not:

\[ E(W_j) = \gamma_j\, E(W_j \mid \theta_j = 1) \tag{6} \]

In general, these integrals will not have an analytic solution and will require
numerical approximation. Since multiple nested convolution integrals must be evaluated,
the potential for approximation errors is compounded. This approach is clearly
computationally oppressive for most service distributions and for problems of realistic
size.
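
Although the integrals are burdensome, Equation (2) is straightforward to apply to a single realization, which suggests a simulation check on any numerical evaluation of expected waiting times. The Python sketch below is only an illustration under stated assumptions: service times are taken to be exponential, a no-show is modeled as a zero service time, and the function names are hypothetical.

```python
import random


def lindley_waits(tau, x):
    """Waiting times for one realization, using the recursion of Equation (2).

    tau: scheduled arrival times, in arrival order.
    x:   realized service times (0.0 for a customer who did not show).
    """
    waits = [0.0]  # the first customer never waits
    for j in range(1, len(tau)):
        gap = tau[j] - tau[j - 1]
        waits.append(max(0.0, waits[j - 1] + x[j - 1] - gap))
    return waits


def simulate_expected_waits(tau, mean_service, show_prob, n_runs=20000, seed=1):
    """Monte Carlo estimate of E[W_j] for each customer.

    Assumption (not from the text): exponential services with the given means.
    A customer who fails to show contributes zero service and zero wait.
    """
    rng = random.Random(seed)
    n = len(tau)
    totals = [0.0] * n
    for _ in range(n_runs):
        shows = [rng.random() < g for g in show_prob]
        x = [
            rng.expovariate(1.0 / m) if s else 0.0
            for m, s in zip(mean_service, shows)
        ]
        w = lindley_waits(tau, x)
        for j in range(n):
            if shows[j]:  # waiting time counts only if the customer shows
                totals[j] += w[j]
    return [t / n_runs for t in totals]
```

The fictitious overtime customer can be included by appending the overtime point to `tau` with a show probability of 1; its estimated wait is then the expected overtime.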

3.2  Coxian  Distribution 

A phase-type (PH) distribution is defined as the distribution of times required
to transit a network of exponential stages, or phases. The Coxian-$r$ distribution is a
phase-type distribution defined here by the particular network in Figure 3. In this
figure, $b_1, b_2, \ldots, b_r$ represent routing probabilities as shown and $\mu_1, \ldots, \mu_r$ represent
the exponential phase rates at each stage. Unlike Cox's original formulation, the
routing probabilities will be constrained to be positive, and the phase rates will be
positive and real [25]. This is still sufficiently general to model distributions with
support on $\mathbb{R}^+$ (the set of positive real numbers).

The utility of the Coxian is twofold. First, with appropriate choice of number
of stages, service means, and routing probabilities, it can approximate most distributions.
Cox demonstrated it can represent exactly any distribution whose PDF has a
rational Laplace transform, if complex transition rates are allowed [25], and Newman
and Reddy showed that the Laplace transform of any PDF may be approximated
arbitrarily closely by a rational function [119]. Thus, a Coxian distribution can be
used to approximate any general distribution [3]. If the support for a PDF is $\mathbb{R}^+$,
a Coxian distribution with real transition rates and positive routing probabilities
suffices to approximate the PDF to any desired accuracy [46]. Approaches to
parsimoniously approximating a distribution with a Coxian distribution are discussed in
Appendix F.

Second,  since  the  Coxian  is  a  sum  of  fractions  of  convolutions  of  exponentials, 
Coxian  service  distributions  yield  Markovian  stochastic  processes  that,  due  to  the 
memoryless  property  of  the  exponential  distribution,  simplify  the  analysis  of  many 
systems  in  fields  such  as  queueing  theory,  insurance  risk  theory,  renewal  theory,  and 
reliability  [6]. 

Consider Figure 3 as the state transition diagram depicting the service distribution
of a single customer. No-shows are allowed, but will be modeled externally
to the distribution. (While the show rate could be considered by adding a routing
around the first phase, and one could thus consider $\gamma = b_0$, this would prevent
a phase-type representation, essential to the argument to follow.) Suppose that a
Coxian process is in stage $k$ at time $t_0$. Let $p(t)$ be the row vector whose $i$th element
$p_i(t)$ is the probability that the process is in the $i$th stage (state) at time $t$. If $i < k$, define
$p_i(t) = 0$. Then the differential equations describing this pure birth process for a
single customer are:

\[ \frac{dp_i(t)}{dt} = -\mu_i\, p_i(t) + b_{i-1}\mu_{i-1}\, p_{i-1}(t) \quad \forall i : k \le i \]

with initial conditions $p_k(t_0) = 1$ and $p_i(t_0) = 0 \; \forall i : i \ne k$. Here, the exit state
is defined as the $(r+1)$st state. Define $b_r = 1$ for consistency. The probability of
exiting by $t$ is just $1 - \sum_{i=1}^{r} p_i(t)$.

If $\mu_1 = \cdots = \mu_r = \mu$ and $b_1 = \cdots = b_{r-1} = 1$, then the sojourn time is
Erlang-$r(\mu)$ distributed, and $p_i(t)$ follows a truncated Poisson distribution. In other cases,
the solution of these differential equations does not lead to a convenient form and,
for all but the smallest cases, is burdensome to calculate by hand. For this reason,
such cases are often handled by constructing the infinitesimal transition matrix $Q$.
The solution is then

\[ p(t) = p(t_0)\, e^{\int_{t_0}^{t} Q\, dt} = p(t_0)\, e^{Q(t - t_0)} \tag{7} \]


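The transition matrix without absorbing state and for a single customer is

\[ T = \begin{bmatrix}
-\mu_1 & b_1\mu_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & -\mu_2 & b_2\mu_2 & 0 & \cdots & 0 & 0 \\
0 & 0 & -\mu_3 & b_3\mu_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & -\mu_{r-1} & b_{r-1}\mu_{r-1} \\
0 & 0 & 0 & 0 & \cdots & 0 & -\mu_r
\end{bmatrix} \]

To include the exit state, define
$T^0 = \begin{bmatrix} (1 - b_1)\mu_1 & (1 - b_2)\mu_2 & \cdots & \mu_r \end{bmatrix}^T$, which
represents the transition rates to the exit state. Then

\[ Q = \begin{bmatrix} T & T^0 \\ 0 & 0 \end{bmatrix} \tag{8} \]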


Substitution into Equation (7) yields the probability vector of the number of stages
completed or bypassed at time $t$, assuming that $p(t_0)$ is known, that no customers
are scheduled to arrive between $t_0$ and $t$, and that only one customer is waiting for
service at $t_0$. The task now is to build an algorithm that eliminates these assumptions
and produces the desired probabilities of being in each state at each time.
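
For a single customer, the construction of $Q$ and the use of Equation (7) can be sketched as follows. This is a minimal pure-Python illustration, not the implementation used in this dissertation: the function names are hypothetical, and the matrix exponential is computed by a simple scaling-and-squaring Taylor series, chosen here only to keep the example self-contained.

```python
import math


def coxian_generator(mu, b):
    """Generator Q for one Coxian customer, with the exit state as last state.

    mu: phase rates mu_1..mu_r; b: routing probabilities b_1..b_{r-1}.
    """
    r = len(mu)
    q = [[0.0] * (r + 1) for _ in range(r + 1)]
    for i in range(r):
        q[i][i] = -mu[i]
        if i < r - 1:
            q[i][i + 1] = b[i] * mu[i]        # continue to the next phase
            q[i][r] = (1.0 - b[i]) * mu[i]    # bypass the remaining phases
        else:
            q[i][r] = mu[i]                   # last phase always exits
    return q


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=20):
    """exp(A) by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    s = max(0, math.ceil(math.log2(norm))) + 1 if norm > 0 else 0
    scaled = [[v / 2 ** s for v in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(s):
        result = mat_mul(result, result)
    return result


def transient(p0, q, dt):
    """State probabilities after an interval dt: p(t) = p(t0) exp(Q dt)."""
    e = mat_exp([[v * dt for v in row] for row in q])
    n = len(p0)
    return [sum(p0[i] * e[i][j] for i in range(n)) for j in range(n)]
```

For two identical phases with rate 2 and $b_1 = 1$, the result after one time unit agrees with the truncated Poisson form: $e^{-2}$, $2e^{-2}$, and $1 - 3e^{-2}$.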


3.3  Cost  Evaluation  Algorithm  for  Coxian  Service 


The distribution of completion time of multiple customers may be found by
expanding the matrix. Assume for now that all customers are initially available for
service. Let $T_{i,i}$ and $T_i^0$ be the transition matrix and the exit probability vector
for the $i$th customer, of size $r_i \times r_i$ and $r_i$, respectively. Construct $T_{i,0}$ by appending
$r_{i+1} - 1$ columns of zeros to $T_i^0$. Build $Q$ as above for the first customer, but
add $r_i$ states for each subsequent customer, using the first state of each subsequent
customer as the exit state of the previous customer. Thus, if customer $i$ is in its $j$th
stage of service, the system is in state $j + \sum_{k=1}^{i-1} r_k$. Now $Q$ may be defined as the
block bidiagonal form

\[ Q = \begin{bmatrix}
T_{1,1} & T_{1,0} & 0 & 0 & \cdots & 0 & 0 \\
0 & T_{2,2} & T_{2,0} & 0 & \cdots & 0 & 0 \\
0 & 0 & T_{3,3} & T_{3,0} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & T_{N,N} & T_N^0 \\
0 & 0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix} \tag{9} \]


If there is a possibility of a customer failing to enter the system, subsequent to
the currently served customer, that possibility can be adjusted for in the following
manner. Let $j = 1$. Consider the exit phase column of $T$ for customer $j$, which
is the initial phase of $j+1$. The probability of customer $j+1$ showing is $\gamma_{j+1}$, so
move a fraction $1 - \gamma_{j+1}$ of each entry in $j$'s exit phase column to the corresponding
entry in the exit column for $j+1$. This is the new $T$. Increment $j$ and repeat if $j < N+1$.
The resulting transition matrix accounts for all no-shows but the first, given that all
customers either arrive at $t = 0$ or fail to show.

In the case of arrivals at times other than $t = 0$, Equation (7) fails; by
the nature of its exponential phases with real transition rates, it can only represent
distributions with support on $[0, \infty)$, and an arrival at $t > 0$ would require support on
$[t, \infty)$. A piecewise strategy overcomes this problem. Assume $p(\tau_j)$ is known. Define
a transition matrix $Q_j$ that accounts only for those customers currently present in
the system, and apply Equation (7) only up to the time of the next arrival to obtain
$p(\tau_{j+1})$. Then add in the $r_{j+1}$ stages of customer $j+1$ to get $Q_{j+1}$. Repeat as
necessary to find the probability vector at the desired time.

The possibility of an initial no-show is a problem in the application of Equation
(7) as well; it would imply the existence of a phase with instantaneous service, which
can only be represented in $Q$ by the limit as the initial phase rate of customer $(j+1)$
goes to infinity (a Bernoulli distribution). However, with the piecewise strategy, this
problem is avoided. At each arrival epoch, one need only modify $p(\tau_j)$ by shifting a
fraction $(1 - \gamma_j)$ of the probability in $j$'s arrival state to its exit state before applying
Equation (7).

In practice, $Q_j$ need never be formed. Instead, $Q$ itself can be used. Although
it generates $p(\tau_{j+1})$ from $p(\tau_j)$, which may have a nonzero probability of being in
infeasible states (i.e., states past $\sum_{k=1}^{j} r_k + 1$), this infeasible probability mass is
merely the probability that further service would have taken place or been bypassed had
other customers been available. As such, it can be accounted for by summing the
probabilities of being in infeasible states and transferring this probability mass to
the exit state of customer $j$. Next, a fraction $1 - \gamma_{j+1}$ of the mass in customer $j$'s exit state
should be transferred to customer $(j+1)$'s exit state, to account for the possibility
of customer $(j+1)$ failing to show. This simplifies the calculation of $p(t)$.
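
The two bookkeeping operations just described, lumping infeasible mass into an exit state and shifting no-show mass forward, can be sketched as below. This is an illustrative fragment only: the function names are hypothetical, and states are indexed from zero here, whereas the text indexes them from one.

```python
def lump_infeasible(p, exit_index):
    """Move all probability mass beyond `exit_index` into `exit_index`.

    p is the state probability vector just after applying Equation (7);
    states past the exit state of the last customer present are infeasible.
    """
    q = list(p)
    q[exit_index] += sum(q[exit_index + 1:])
    for s in range(exit_index + 1, len(q)):
        q[s] = 0.0
    return q


def shift_no_show(p, arrival_index, exit_index, show_prob):
    """Shift a fraction (1 - show_prob) of the mass in an arriving customer's
    first state to that customer's exit state, modeling a no-show."""
    q = list(p)
    moved = (1.0 - show_prob) * q[arrival_index]
    q[arrival_index] -= moved
    q[exit_index] += moved
    return q
```

Both operations conserve total probability, which gives a convenient sanity check when they are chained with the exponentiation step.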

Now that the probability of being in each state at each time is determined, a
last construct is required to determine expected waiting times. Define a conditional
expected waiting time vector $\Omega^{(j)}$, in which $\Omega^{(j)}_i$ represents the expected waiting
time of customer $j$ conditioned on the state upon $j$'s arrival being $i$. (It is defined as
a set of vectors, rather than an array, only for notational convenience.) Let $X_{h,i}$ be
the expected remaining service time of customer $h$, given that it is in the $i$th stage
of its service. Let $b_{h,i}$ be the transition probability from the $i$th phase to the $(i+1)$st
phase of the $h$th arrival, and let $\mu_{h,i}$ be the transition rate of the $i$th phase of the $h$th
arrival. Then by inspection of Figure 3,

\[ X_{h,i} = \begin{cases}
1/\mu_{h,i} & i = r_h \\
1/\mu_{h,i} + b_{h,i}\, X_{h,i+1} & 1 \le i < r_h
\end{cases} \tag{10} \]


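can be used to recursively obtain all of $X$. If the current customer is $h$ and it is in
its $i$th stage of service, then by inspection of Figure 4,

\[ \Omega^{(j)}_{h,i} = \begin{cases}
0 & j \le h \\
X_{h,i} & j = h+1 \\
\Omega^{(j-1)}_{h,i} + \gamma_{j-1} X_{j-1,1} & j > h+1
\end{cases} \tag{11} \]

where $\Omega^{(j)}_{h,i}$ denotes the entry of $\Omega^{(j)}$ for that state, can be used to obtain all
values of each vector $\Omega^{(j)}$. The expected waiting time for customer $j$, assuming it does
not have to wait if it does not show, is then

\[ E(W_j) = \gamma_j\, p(\tau_j)\, \Omega^{(j)}. \tag{12} \]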


Wang  and  Gray  accomplished  the  same  result  for  the  case  of  iid  Erlang  ser¬ 
vices  by  a  scheme  that  requires  the  inversion  of  T  [57,  163].  Equations  (10)  and 
(12)  provide  an  efficient  alternative  to  their  scheme,  accounting  for  idd  services  and 
avoiding  the  potential  floating  point  problems  discussed  in  Appendix  G. 
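
The recursion for expected remaining service times runs backward from the last stage: the expected remaining time in stage $i$ is the mean sojourn in that stage plus, with the routing probability, the expected remaining time from the next stage. A minimal Python sketch for one customer follows; the function name is hypothetical and the lists are indexed from zero.

```python
def remaining_service(mu, b):
    """Expected remaining service time from each stage of one Coxian customer.

    mu: phase rates mu_1..mu_r; b: routing probabilities b_1..b_{r-1}.
    Returns a list x where x[i] is the expected remaining service time
    given that the customer is currently in stage i + 1.
    """
    r = len(mu)
    x = [0.0] * r
    x[r - 1] = 1.0 / mu[r - 1]                 # last stage: only its own sojourn
    for i in range(r - 2, -1, -1):
        x[i] = 1.0 / mu[i] + b[i] * x[i + 1]   # own sojourn, then maybe continue
    return x
```

No matrix inversion is involved, so the cost is linear in the number of stages.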

The  following  algorithm  summarizes  the  procedure  outlined  above  for  obtaining 
the  expected  waiting  times  for  Coxian  service  with  no-shows. 

Cost Evaluation Algorithm For Coxian Service

1. Set $E(W_1) = 0$. Set $p(0) = [\,1\ 0\ 0\ \ldots\ 0\,]$. Set $j = 1$. Set $h = 0$.

2. Account for no-shows by replacing $p(\tau_j)[h+1]$ with $\gamma_j\, p(\tau_j)[h+1]$ and
$p(\tau_j)[h + r_j + 1]$ with $(1 - \gamma_j)\, p(\tau_j)[h+1]$.

3. Let $p(\tau_{j+1}) = p(\tau_j) \exp[Q(\tau_{j+1} - \tau_j)]$. Let $h = h + r_j$.

4. Let $p(\tau_{j+1})[h+1] = \sum_{i \ge h+1} p(\tau_{j+1})[i]$. Set $p(\tau_{j+1})[i] = 0$ for each $i > h+1$. If
$j < N$, let $j = j+1$ and return to step 2.

5. Obtain the expected waiting time vector from Equations (10), (11), and (12).

6. Apply Equation (1) to obtain the cost associated with this schedule.

This  algorithm  works  for  either  lattice  or  continuous  arrival  times.  In  the  case 
of  lattice  arrival  times,  some  simplifications  may  be  made,  and  these  are  discussed 
in  Section  3.5. 

When  arrivals  are  not  restricted  to  lattice  times,  there  is  not  a  practical  reason 
to  consider  the  waiting  time  in  the  event  two  customers  are  scheduled  to  arrive  at  the 
same  time;  such  a  schedule  cannot  be  optimal  unless  the  cost  of  customers  waiting 
is  zero.  However,  the  algorithm  does  work  for  simultaneous  arrivals.  More  efficient 
approaches  to  simultaneous  arrivals  will  be  addressed  in  detail  in  Section  3.5,  which 
discusses  lattice  arrival  times. 

The accuracy of the above cost evaluation is mainly dependent on the accuracy
of the matrix exponentiation in step 3, since the remaining operations involve
only addition and subtraction. The accuracy of the matrix exponentiation used
is discussed in detail in Appendix G. That section concludes that exponentiation
becomes less accurate as two phase rates become extremely close without coinciding,
or as one of the phase rates diverges from the others. If neither of these situations
obtains, the exponentiation is assumed here to be accurate.

3.4  Erlang Service Distribution


The above formulation is feasible but is computationally intensive if many cost
evaluations are to be made. If the service distributions are iid, have a coefficient of
variation less than or equal to one, and it is acceptable to approximate the second
moment rather than match it exactly, an Erlang approximation leads to further
savings in computation. The Erlang distribution with $r$ stages has PDF

\[ f(t) = \frac{r\mu\, e^{-r\mu t}\, (r\mu t)^{r-1}}{(r-1)!}, \quad t \ge 0 \tag{13} \]


To approximate a given PDF, the parameter $\mu$ is chosen to match the first moment
of the distribution. Given variance (or sample variance, if one is approximating an
empirical PDF) of $\sigma^2$, the number of stages is selected by

\[ r = \mathrm{int}\left[\frac{1}{\sigma^2 \mu^2}\right] \tag{14} \]

Since the variance will be approximated by the inverse of an integer, the
approximation will be more accurate for smaller variances.
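
The moment-matching step can be sketched in a few lines of Python. This is an illustration under an assumed parametrization, in which each of the $r$ phases has rate $r\mu$, so that the mean service time is $1/\mu$ and the variance is $1/(r\mu^2)$; the function name and the guard that forces at least one stage are additions for this example, not part of the text.

```python
def erlang_parameters(mean, variance):
    """Choose (mu, r) for an Erlang approximation.

    Assumed parametrization: r phases, each with rate r * mu, giving mean 1/mu
    and variance 1/(r * mu**2). The mean is matched exactly; the variance is
    matched only as closely as an integer number of stages allows.
    """
    mu = 1.0 / mean
    r = max(1, int(1.0 / (variance * mu * mu)))  # at least one stage
    return mu, r
```

For a mean of 1.0 and a variance of 0.3, this yields three stages and an implied variance of one third, which shows the rounding effect on the second moment.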


Liao first obtained the expected waiting times for the case of lattice arrival
times with an iid Erlang service distribution [95, 96, 97]. Here, the case of arrival times
not restricted to a lattice is examined first. Since the Erlang distribution with $r$
stages is just a special case of the Coxian with $r$ iid stages with mean service rate
$\mu$ and transition probabilities all set to 1.0, the above algorithm for Coxian services
could be used to obtain expected waiting times. Computation in this case is sped up
greatly, since

\[ Q = \begin{bmatrix}
-r\mu & r\mu & 0 & \cdots & 0 & 0 \\
0 & -r\mu & r\mu & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -r\mu & r\mu \\
0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix} \tag{15} \]

and $e^{Q(t - t_0)}$ becomes an upper triangular matrix for which the $(i,j)$ entry follows a
truncated Poisson distribution:

\[ e^{Q(t-t_0)}(i,j) = \begin{cases}
0 & i > j \\[4pt]
\dfrac{e^{-r\mu(t-t_0)}\, (r\mu(t-t_0))^{j-i}}{(j-i)!} & i \le j < rN+1 \\[8pt]
1 - \displaystyle\sum_{m=i}^{j-1} \dfrac{e^{-r\mu(t-t_0)}\, (r\mu(t-t_0))^{m-i}}{(m-i)!} & j = rN+1
\end{cases} \tag{16} \]

For the last column (index $j = rN+1$), the entries are sums of all the Poisson
probabilities for $j \ge rN+1$. All entries are easily calculable with a recursive
routine, reducing computation when calculating $e^{Q(t-t_0)}$. No further simplifications
are helpful for the case of Erlang-$r$ services with unrestricted arrival times.
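
A recursive routine of the kind mentioned above can be sketched as follows. The Python function below is only an illustration: its name is hypothetical, `rate` stands for the common rate of the exponential phases, and `n_states` is the number of transient states (the absorbing exit state is added as the last column).

```python
import math


def erlang_transition_matrix(n_states, rate, dt):
    """Transition probabilities over an interval dt for a pure-birth chain
    with a common phase rate and one absorbing last state.

    Entry (i, j) for j before the last column is the Poisson probability of
    exactly j - i phase completions; the last column collects the tail.
    """
    size = n_states + 1
    a = rate * dt
    # Poisson probabilities of 0, 1, ..., n_states - 1 completions,
    # each obtained from the previous one.
    pois = [math.exp(-a)]
    for m in range(1, n_states):
        pois.append(pois[-1] * a / m)
    p = [[0.0] * size for _ in range(size)]
    for i in range(size):
        tail = 1.0
        for j in range(i, size - 1):
            p[i][j] = pois[j - i]
            tail -= pois[j - i]
        p[i][size - 1] = tail  # remaining Poisson mass is absorbed
    return p
```

Because every row reuses the same list of Poisson terms, the whole matrix costs only one exponential evaluation.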


3.5  Lattice  Arrival  Times 

For lattice arrival times, simplifications arise from two sources. Let $\Delta$ be the
smallest allowable time interval. First, $e^{Q\Delta}$ may be calculated at the beginning of the
algorithm, obviating the need to calculate $e^{Q(t-t_0)}$ at any iteration, since $t - t_0 = k\Delta$
for some integer $k$, and $e^{Qk\Delta} = (e^{Q\Delta})^k$. This substantial simplification can also be
applied to continuous cases by imposing a lattice that allows approximation of the
arrival times; that is, by setting $\Delta$ equal to the largest number that is (approximately)
an integral factor of each of the interarrival times. It should be noted that,
if the overtime point is later than the last arrival time and is not a lattice point, another
exponentiation must be performed. The computer program EVALUATE in Section
H.1 approximates the overtime point by the nearest lattice point in order to avoid
this second exponentiation.
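
The first simplification can be checked numerically with a short Python sketch. The code below is illustrative only: the function names are hypothetical, the one-step matrix uses the truncated Poisson form for a common phase rate, and the power is formed by repeated squaring rather than by any method prescribed in the text.

```python
import math


def step_matrix(n_states, rate, dt):
    """One-interval transition matrix for a pure-birth chain with a common
    phase rate and an absorbing last state (truncated Poisson entries)."""
    a = rate * dt
    pois = [math.exp(-a)]
    for m in range(1, n_states):
        pois.append(pois[-1] * a / m)
    size = n_states + 1
    p = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size - 1):
            p[i][j] = pois[j - i]
        p[i][size - 1] = 1.0 - sum(p[i][:size - 1])
    return p


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, k):
    """a raised to the nonnegative integer power k by repeated squaring."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    base = a
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result
```

Computing `step_matrix` once for the lattice interval and raising it to the power $k$ reproduces the matrix for an interval $k$ times as long, so no further exponentials are needed inside the loop over arrivals.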

Second, the likelihood of multiple arrivals at an instant is no longer infinitesimal.
While the Coxian-$r$ algorithm can be used, there is a faster approach. Let
$k$ be the index of the first customer scheduled to arrive at $\tau_k$, and let $v(\tau_k)$ be the
number of customers scheduled to arrive at $\tau_k$. $E(W_k)$ may be found as usual. For
subsequent customers, $E(W_{k+i}) = E(W_{k+i-1}) + \gamma_{k+i-1} X_{k+i-1,1}$ for $i \in [1, v(\tau_k) - 1]$. In the
special case of lattice arrival times, iid Erlang-$r$ services, and $\gamma_1 = \cdots = \gamma_N = \gamma$,
$E(W_{k+i}) = E(W_k) + i\gamma/\mu$, and the total expected waiting time of customers arriving
at $\tau_k$ is

\[ \sum_{i=0}^{v(\tau_k)-1} E(W_{k+i}) = v(\tau_k)\, E(W_k) + \frac{\gamma\, v(\tau_k)\,(v(\tau_k) - 1)}{2\mu} \tag{17} \]

The  following  summarizes  the  cost  algorithm  in  the  case  of  iid  Erlang  service, 
lattice  arrival  times,  and  constant  show  rate. 

Cost  Evaluation  Algorithm  For  iid  Erlang  Service 

1. Construct $e^{Q\Delta}$ by using Equation (16). Let $p(0) = [\,1\ 0\ \ldots\ 0\,]$. Let $j = 1$.
Let $\tau_0 = 0$.

2. Let $v(\tau_j)$ be the number of arrivals at $\tau_j$. Let $k = (\tau_j - \tau_{j-1})/\Delta$.

3. Let $p(\tau_j) = p(\tau_{j-1})\,(\exp[Q\Delta])^k$.

4. Let $p(\tau_j)_{rj+1} = \sum_{i \ge rj+1} p(\tau_j)_i$. Let $p(\tau_j)_i = 0$ for $i > rj+1$. This shifts all the
infeasible probability mass to the exit state of $j$. Let $h = 0$.

5. Let $p(\tau_j)_{r(j+h)+1} = \gamma\, p(\tau_j)_{r(j+h)+1}$. Let $p(\tau_j)_{r(j+h+1)+1} = (1 - \gamma)\, p(\tau_j)_{r(j+h)+1}$.
If $h < v(\tau_j)$, then increment $h$ and repeat this step. This adjusts $p(\tau_j)$ for
no-shows of arrivals at $\tau_j$.

6. Find the sum of expected waiting times for arrivals $j$ through $j + v(\tau_j) - 1$
using Equations (2) and (17).

7. Let $j = j + v(\tau_j)$. If $j < N+1$, return to step 2.

8.  Apply  Equation  (1)  to  obtain  the  cost  associated  with  this  schedule. 

3.6  Modeling  Lateness 


The previous cost formulation assumed punctual customers if the customer
joined the queue at all, but allowed for the possibility of no-shows. This is the case
addressed in the majority of this dissertation because of the number of problems
faced in optimizing appointment systems where lateness is permitted. It is, however,
possible to obtain the cost of an appointment system in which each customer has
some lateness distribution, and that is shown in this section. The basic approach
is to roughly model the lateness distribution as a bounded discrete distribution and
prorate the cost of each possible resulting realization of the lateness values by its
probability of occurrence.

Several authors have examined the effects of lateness on an appointment system
[137, 168], but the continuous distribution approaches used would preclude the
imbedded Markov approaches used so far in cost evaluation, so these earlier approaches
are rejected. Here, the jth customer is assumed to have a lateness distribution
$\ell_j(t)$ that is iid, discrete with support at lattice points only, bounded below by
0, and bounded above by [...] $- \tau_j$. The lower bound of 0 is for convenience only and
does not preclude a customer arriving early; in this case, one need merely redefine its
arrival time to the earliest possible and adjust the lateness distribution accordingly.

Suppose the lateness probabilities for customer j are nonzero for $e_j$ values, all of
which are schedule lattice points. Because the lateness distributions are independent,
the probability of a given realization of latenesses is just the product of
the probabilities of each lateness occurring. This leads to the formulation of the
system as a set of $\prod_{j=1}^{N} e_j$ instances (in which the customers are each punctual). The
expected waiting time of customer j is just the sum, over the possible instances, of the
product of the waiting time for j obtained in an instance and the probability of
that instance.
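
The enumeration over lateness realizations can be sketched as follows. This is a minimal illustration, not the dissertation's evaluator: `punctual_waits` is a hypothetical stand-in that uses fixed service times and serves customers in scheduled order, whereas the text obtains each instance's waiting times from the Markov-chain cost evaluation.

```python
from itertools import product


def punctual_waits(arrivals, services):
    """Stand-in evaluator: waits when customers are served in scheduled order."""
    waits, free_at = [], 0.0
    for tau, x in zip(arrivals, services):
        start = max(free_at, tau)   # server may idle until the customer arrives
        waits.append(start - tau)
        free_at = start + x
    return waits


def expected_waits_with_lateness(schedule, lateness_pmfs, services):
    """Weight each realization's waits by the product of its lateness probabilities."""
    expected = [0.0] * len(schedule)
    for realization in product(*[pmf.items() for pmf in lateness_pmfs]):
        prob, arrivals = 1.0, []
        for tau, (late, p) in zip(schedule, realization):
            prob *= p                    # independence: probabilities multiply
            arrivals.append(tau + late)  # actual arrival = scheduled + lateness
        for j, w in enumerate(punctual_waits(arrivals, services)):
            expected[j] += prob * w
    return expected


# Two customers; the first is on time or one slot late with equal probability.
result = expected_waits_with_lateness(
    schedule=[0, 2],
    lateness_pmfs=[{0: 0.5, 1: 0.5}, {0: 1.0}],
    services=[2.0, 1.0],
)
```

The number of loop iterations equals the product of the support sizes, which is why the number of lateness values per customer drives the run time.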

Assuming that the customers are served in the order in which they were scheduled,
rather than in the order they actually arrived, the same transition matrix Q
can be used for each sub-problem as was used in the punctual case defined earlier.
Because the state space is not increased, and since only a single matrix exponentiation
need be performed still, the additional calculations required to assess cost when
lateness is allowed are not extensive. However, because the number of sub-problems
that must be evaluated is $\prod_{j=1}^{N} e_j$, the number of possible values of
lateness to be considered for each customer will affect the calculation speed.

If  the  service  discipline  is  FIFO,  on  the  other  hand,  customer  order  may  not 
be  the  same  for  each  possible  realization  of  the  lateness.  In  such  a  case,  Q  must  be 
reconstructed  and  re-exponentiated  for  each  possible  service  order.  This  will  lead  to 
longer  run  times  for  the  cost  evaluation  algorithm  under  FIFO. 

It  is  seen  that  the  cost  of  an  appointment  system  in  which  customers  can 
be  late  or  early  can  be  approximated  quite  effectively.  However,  this  dissertation 
does  not  consider  the  effect  of  lateness  further,  since  the  optimization  algorithms 
to  be  presented  will  depend  on  convexity  of  the  cost  function  and  the  equivalence 
of  the  scheduled  and  the  actual  arrival  order.  While  consideration  of  the  effects  of 
lateness  on  the  optimal  schedule  policy  is  undoubtedly  important,  that  effort  must 
be  relegated  to  future  research. 

3.7 The Nature of the Objective Function

With the above evaluation tools available, the nature of the cost function is
more easily explored. Examination of some rough plots will help the reader understand
the nature of the function and the task of finding the optimal schedule and
sequence. In Figure 5, a fixed time horizon of $t \in [0, 30]$ was imposed, in which
three identical customers with iid exponential service distributions must be scheduled.
Costs are linear functions, and the unit costs $c_1$, $c_2$, and $c_3$ are equal. The unit
overtime cost, $c_4$ in this case, is arbitrarily set equal to $c_1$ and the overtime point is
set to $\tau_h$. The probability of a no-show is set to zero. The first customer’s arrival
time is fixed at $\tau_1 = 0$, and the abscissae represent $\tau_2$ and $\tau_3$. Several insights can
be gleaned. First, the plot must be symmetric about $\tau_2 = \tau_3$, since the customers
are identical in every way. Even if the customers were not identical, the line $\tau_2 = \tau_3$
would divide the plot into two convex regions (as will be proved in Section 4.1).
This line represents a ridge; waiting time is locally maximal along coordinate axis
directions when two customers are scheduled to arrive at the same time.

Minima seem to occur when the customers are scheduled at roughly equal time
intervals. However, the interarrival times are not in general equal for the optimal
schedule with identical customers, as the following argument shows. Consider a very
simple case: two customers, with a fictitious third customer fixed at $\tau_3 = \tau_h$ to
account for server overtime, and the first fixed at $\tau_1 = 0$. The wait imposed on
customer 2 is a decreasing function of $\tau_2$, and the wait imposed on customer 3 is an
increasing function of $\tau_2$. If the sum of the service times of customers 1 and 2 never
exceeded $\tau_3$, the first function would be a reflection of the second about $\tau_2 = \tau_3/2$,
and the minimum of their sum is clearly attained at $\tau_2 = \tau_3/2$. When the possibility
increases that the sum of the services of customers 1 and 2 exceeds $\tau_3$, customer
3’s wait also increases, but customer 2’s wait is unaffected. This disturbance of the
symmetry will shift the optimal value of $\tau_2$ lower (as $P[\chi_1 + \chi_2 > \tau_3]$ increases).

Figure 6 depicts a case in which four customers are to be scheduled in the fixed
horizon $t \in [0, 40]$. For the purposes of the plot, $\tau_1 = 0$ and $\tau_4 = 30$ are fixed, while
$\tau_2$ and $\tau_3$ are plotted on the x and y axes, as before. Services are iid exponential
with  mean  of  5  time  units  for  customers  1,  2,  and  3,  while  customer  4’s  service  is 
exponential  with  mean  of  10  time  units.  Costs  are  linear  with  equal  coefficients, 
as  before.  Again,  the  plot  is  piecewise  convex  for  a  particular  order  of  customers, 
reaching  local  maxima  as  two  or  more  customers  are  scheduled  to  arrive  at  the  same 
time  (proved  in  Section  4.1).  The  task  of  locating  the  minimum  could  be  considered 
twofold.  One  problem  is  to  locate  the  optimum  schedule  for  each  order  of  customers 
(the  scheduling  problem),  while  the  second  is  to  find  the  smallest  of  these  optima 
(the  sequencing  problem).  Efficient  pursuit  of  these  two  objectives  to  obtain  the  best 
appointment policy is the goal of the rest of this dissertation.

IV. Scheduling Arrivals When the Sequence is Fixed

The goal of this chapter is to provide an effective tool to determine the optimal
schedule of arrivals, given that the order of arrivals is fixed. This is tantamount
to searching the arrival time space over the region for which the constraints
$\tau_1 \le \tau_2 \le \cdots \le \tau_N$ hold.

It was claimed in Section 1.2 that one may set $\tau_1 = 0$ without losing generality,
since an optimal schedule will always have the first customer arrive at the
start  of  the  server’s  availability.  A  later  arrival  time  would  incur  increased  server 
idle  cost  without  a  reduction  in  total  waiting  cost.  An  earlier  arrival  time  would 
increase  waiting  time  for  the  first  customer  while  failing  to  reduce  idle  costs  or  other 
customers’  waiting  costs. 

Since, as will be proven, the cost function is convex with respect to $\tau$ in the
admissible region, an optimum is assured, and any number of nonlinear optimization
schemes may be applied to obtain the continuous optimum. For instance, Healy et
al. applied a Hooke-Jeeves optimization [60, 61], while Wang and Gray applied both
a simple gradient search and a conjugate gradient search [57, 160].

For lattice arrival time problems, enumeration is not possible unless a finite
time horizon is imposed, and even then it is prohibitive for moderately-sized problems.
Suppose there are K time slots and N customers, the first of which has a
fixed arrival time. By a simple combinatorial argument, there are $\binom{N+K-2}{N-1}$ possible
schedules to evaluate. A problem involving 20 customers and 48 time slots would
generate $\binom{66}{19} \approx 1.73 \cdot 10^{16}$ candidate schedules. Clearly, another approach is required.
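
The size of the search space can be reproduced with a short sketch. It assumes that a lattice schedule is a nondecreasing assignment of the N - 1 free customers to the K slots, with the first customer pinned to slot 0; the function names are illustrative.

```python
from itertools import combinations_with_replacement
from math import comb


def count_lattice_schedules(n_customers, n_slots):
    """Number of nondecreasing schedules with the first customer fixed at slot 0."""
    return comb(n_customers + n_slots - 2, n_customers - 1)


def enumerate_lattice_schedules(n_customers, n_slots):
    """Yield every schedule explicitly; practical only for very small problems."""
    for rest in combinations_with_replacement(range(n_slots), n_customers - 1):
        yield (0,) + rest


small = list(enumerate_lattice_schedules(3, 5))   # the 3-customer, 5-slot case
large = count_lattice_schedules(20, 48)           # the 20-customer, 48-slot case
```

Brute-force enumeration agrees with the closed-form count on the small case, while the large case is far beyond exhaustive evaluation.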

Liao et al. utilized a bound provided by the optimal dynamic schedule and
applied a branch-and-bound technique to solve such problems [95, 96, 97]. Simeoni
applied a variation on a simple coordinate search that was based on the convexity
and submodularity of the cost function [145]. The approach to be advocated here
for both lattice and continuous arrival times is a substantial extension of Simeoni’s
coordinate search.

An example may help focus on the pertinent aspects of this optimization for
lattice arrival times. Figure 7 shows the cost of several schedules with 3 customers
and 5 time slots. Unit waiting costs and overtime cost are identical ($c_2 = c_3 = c_4$),
$\Delta = 1$, and the overtime point is located one slot past the schedule horizon
($\tau_{N+1} = 5$). Services are iid exponential distributions with a mean of 1.


Figure 7. Plot of cost vs. coefficient ratio. Normalized cost vs. normalized overtime
cost coefficient for several schedules with 5 slots and 3 customers.
The dotted line is the optimal cost if the schedule is not constrained to
be lattice. The dashed line is the cost if the interarrival times are all
set equal.


The actual values of the waiting time and overtime coefficients are also irrelevant
to the choice of optimal schedule; only their ratio affects the optimum. Figure
7 displays the total schedule cost for each possible $c_3$ ($= c_2$) and $c_4$ by normalizing $c_4$
and C by the factor $c_4 + c_3$. Only the six schedules (out of 15 possible) that become
optimal for some choice of $c_4/(c_4 + c_3)$ (i.e., are non-dominated) are displayed.

The schedule $\tau = (0,2,4)$ is seen to be optimal for values of $c_4/(c_4 + c_3) < 0.57$,
with $\tau = (0,1,3)$ optimal for values of $c_4/(c_4 + c_3)$ between 0.57 and 0.86, and (0,1,2),
(0,0,2), (0,0,1), and (0,0,0) becoming optimal in turn as the overtime cost increases
further still. For any problem, the set of non-dominated schedules may be determined
similarly over the range of $c_{N+1}/(c_{N+1} + c_N)$.

As the size of the lattice is decreased, the polygonal envelope of optimal solutions
over the range of cost coefficients will converge pointwise to the optimal
solution in the continuous case. This can be seen by creating a series of functions
$f_i(c_{N+1}/(c_{N+1} + c_N))$ in which the ith function is the optimal solution for the division
of the schedule into i + 1 lattice points. The sequence $f_i(x), f_{2i}(x), f_{4i}(x), \ldots$
is monotonic, nonincreasing, and bounded below by zero for each value of x. The
sequence of functions therefore converges pointwise, and its limit must be the optimal
solution in the continuous case. Each function in the series is the pointwise
minimum of a collection of functions that are linear, so each is concave with respect
to $c_{N+1}/(c_{N+1} + c_N)$, from which it follows that convergence of the sequence is
uniform [134: Theorem 10.8]. Continuity of the cost function for continuous arrival
times follows immediately. It should be noted that this argument applies equally
when the unit waiting costs are not all equal; the scaling factor $(c_{N+1} + c_N)$ could
just as easily have been $(c_{N+1} + c_{N-1})$, for example.

The  importance  of  continuity  lies  in  sensitivity  analysis;  a  small  modification 
in  the  importance  of  one  of  the  cost  terms  will  not  result  in  a  disproportionate 
improvement  or  degradation  in  the  optimal  cost.  Appendix  D  explores  the  sensitivity 
and  dependence  of  the  optimal  schedule  and  cost  on  various  schedule  parameters. 

Section 3.1 showed the cost to be a function of integrals that, by Leibniz’s
Theorem, are differentiable over $\tau$ if the service PDFs are all of bounded variation [4].
For the cost function in Equation (1), they are also differentiable with respect to c.
Let $C(c)$ represent the optimal cost over possible schedules for a given unit cost
vector c. Because $C(c)$ is the limit of a uniformly convergent sequence of concave,
differentiable functions, it is also differentiable with respect to c [134: Theorem 25.7
and discussion preceding].

The  straight  dashed  line  represents  the  solution  when  the  interarrival  times 
are  all  set  equal.  This  represents  the  traditional  approach  to  schedule  creation, 
and  in  this  case  it  is  close  to  the  optimum  for  coefficient  ratios  near  0.6.  It  was 
shown  in  Section  3.7  that  this  traditional  schedule  usually  is  not  optimal  for  a  finite 
number  of  identical  customers,  but  it  was  not  clear  how  much  savings  could  be 
achieved. If Bailey’s dictum that “a doctor’s time is 37.5 times more valuable than
the patients’” [7] is followed, $c_4/(c_4 + c_3) = 0.974$, leading to the optimal schedule
(0,0,1)  for  the  chosen  lattice  size.  The  cost  of  this  coarse  lattice  optimum  is  very 
close  to  that  of  the  continuous  optimum,  but  40%  less  than  that  of  the  ubiquitous 
equi-spaced  schedule,  regardless  of  the  number  of  schedule  slots.  It  is  clear  that 
schedule  optimization  can  lead  to  substantial  cost  improvements,  even  when  fixing 
the  sequence  of  customers. 
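
As a worked check of the coefficient ratio quoted for Bailey’s dictum, assuming the dictum is read as $c_4 = 37.5\,c_3$:

```latex
\[
\frac{c_4}{c_4 + c_3}
  = \frac{37.5\,c_3}{37.5\,c_3 + c_3}
  = \frac{37.5}{38.5}
  \approx 0.974
\]
```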

Rather  than  an  equi-spaced  schedule,  Charnetski  [21],  Weiss  [164],  and  others 
sought  the  schedule  that  balances  expected  waiting  times  of  each  customer  (except 
the  first).  Such  a  schedule  would  have  the  advantage  that  no  customer  would  perceive 
a  benefit  to  choosing  a  particular  position  in  the  customer  sequence  (other  than  the 
first  appointment  of  the  day).  As  it  stands,  many  customers  perceive  a  waiting-time 
advantage  to  being  scheduled  early  in  the  customer  sequence  [13],  and  this  perception 
is frequently correct. To give an example, for a 3-customer, 101-slot problem with
$\Delta = 0.3$, the overtime point equal to the horizon, all cost coefficients equal to 1.0, all
show probabilities equal to 1.0, and all services iid Exp(1.0), the results are shown
in Table 2.

The cost of the schedule that comes closest to balancing waiting times in this case
exceeds that of the globally optimal schedule by 23%. The end user must determine
whether a near-balanced schedule would reduce customer or server dissatisfaction
and whether this is a reasonable price to pay to do so.

Table 2. Comparison of results when balancing waiting times to results when minimizing
the sum of waiting times. This is a 3-customer, 101-slot example
with $\Delta = 0.3$, $\tau_{N+1} = \tau_h = 30.0$, $c_2 = c_3 = c_4$, and all services exponential
with mean of 1.0.

            $\tau_1$   $\tau_2$   $\tau_3$   $W_2$    $W_3$    overtime   cost
optimal     0          11.1       24.6       0.330    0.460    1.003      1.79
balanced    0          3.0        16.2       0.719    0.728    0.735      2.20

This  effort  will  concentrate  on  the  globally  optimal  solution.  A  number  of 
propositions  require  proof  before  proceeding  with  the  proposed  algorithm,  beginning 
with  the  issue  of  convexity. 

4.1  Convexity  of  the  Cost  Function 

Wang considered a linear cost function of the expected waiting times and total
service time for which $\gamma_1 = \cdots = \gamma_N = 1$, the scheduling horizon is unconstrained,
and all cost functions are linear [160]. He proved the cost function under these
conditions satisfies strong stochastic convexity with respect to the interarrival vector,
$[\,\tau_2,\ \tau_3 - \tau_2,\ \tau_4 - \tau_3,\ \ldots,\ \tau_N - \tau_{N-1}\,]$, and service vector. Shantikumar et
al. define a function to be strongly stochastically convex if (in essence) it is convex
almost surely. They pointed out that strong stochastic convexity implies stochastic
convexity [143], and Wang used this result to obtain stochastic convexity for his cost
function. In this section, Wang’s argument is modified slightly to prove convexity of
$C(\tau)$ for the cost function proposed in Equation (1) with respect to $\tau$.

Theorem 1 Assume the elements of $\tau$ are independent of each other, with the exception
that $0 \le \tau_1 \le \tau_2 \le \cdots \le \tau_{N+1}$, and $\tau_{N+1}$ may or may not be fixed. Assume the
elements of $\chi$ are independent of each other and of $\tau$. Then $C(\tau) = \sum_{i=1}^{j} c_i E[W_i(\tau)]$
is a convex function of $\tau$ and $\chi$ for $j \le N + 1$.

Proof: Arbitrarily fix the service vector $\chi$ and the no-show vector $\theta$. A recursive
argument will be used to show that if $W_j$, conditioned on whether customer j
showed, is convex, then so is $W_{j+1}$, conditioned on whether customers j and j + 1
showed. The starting point will be j = 1, since $W_1 = 0$ is convex with respect to $\tau$,
regardless of the value of $\theta_1$.

Equation (2) will now be modified to account for no-shows. As was done in
Section 3.1, just for the purposes of calculating $W_{j+1}|(\theta_{j+1}=1)$, one can picture that
customer j always showed, waited for service, and then was served instantaneously
with probability $1 - \gamma_j$, otherwise undergoing service of length $\chi_j$.

$$W_{j+1}|(\theta_{j+1}=1,\ \theta_j=1) = \max\left[0,\ W_j|(\theta_j=1) + \chi_j - \tau_{j+1} + \tau_j\right]$$

$$W_{j+1}|(\theta_{j+1}=1,\ \theta_j=0) = \max\left[0,\ W_j|(\theta_j=1) - \tau_{j+1} + \tau_j\right] \qquad (18)$$

$$\Rightarrow\ W_{j+1}|(\theta_{j+1}=1) = \gamma_j \max\left[0,\ W_j|(\theta_j=1) + \chi_j - \tau_{j+1} + \tau_j\right] + (1 - \gamma_j) \max\left[0,\ W_j|(\theta_j=1) - \tau_{j+1} + \tau_j\right]$$

Since $\max(\cdot)$ is a convex, nondecreasing function, Equation (18) is a convex
function of $\tau$ and $\chi$ if $W_j|(\theta_j=1)$ is a convex function of $\tau$ and $\chi$ [134: Theorem 5.1].
It follows by mathematical induction that $W_j|(\theta_j=1)$ is convex for all j and for fixed
$\theta$ and $\chi$.

Since $W_j = \theta_j \left(W_j|(\theta_j=1)\right)$, it also is convex. $E[W_j(\tau)]$ may be thought of
as a convex combination of the $W_j$ for each possible combination of $\chi$ and $\theta$, if
$\int W_j(\chi)\,df_j(\chi_j)$, the Riemann-Stieltjes integral defining the expectation of the jth
waiting time, exists for each j. This integral exists and is continuous with respect
to $\tau$ if each of the service PDFs, $f_j(\chi_j)$, is of bounded variation and if $W_j(\tau)$ is
continuous with respect to $\chi$ and $\tau$ [4: Theorem 7.38]. Continuity of $W_j(\tau)$ is assured
over the range of $\chi$ and over $0 = \tau_1 \le \tau_2 \le \cdots \le \tau_N \le \tau_{N+1}$, since Equation (18) holds and
is dependent on the maxima of a set of continuous functions of $\tau$ and $\chi$. Thus, the
restriction is that each $f_j(\chi_j)$ be of bounded variation. If they are, then $E[W_j(\tau)]$ is
a convex function of $\tau$, for all $\tau$ that maintain the order of scheduled arrivals. Note
that $E[W_j(\tau)]$ is not stochastic, so stochastic convexity need not be invoked.

If the cost is a convex, nondecreasing function of each $E[W_j(\tau)]$, then $C(\tau)$ is
also convex with respect to $\tau$ [134: Theorem 5.1]. In Equation (1), the cost is taken
as a (multiple of a) convex combination of each $E[W_j(\tau)]$, so the theorem is proved.
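
The sample-path convexity that drives the proof can be probed numerically. The sketch below is only an illustration under stated assumptions: it fixes a service vector and a no-show vector, evaluates the waits each customer would experience if present by the recursion $W_{j+1} = \max(0, W_j + \theta_j \chi_j - (\tau_{j+1} - \tau_j))$, and tests midpoint convexity of their sum on random ordered schedules. The distributions and parameter values are arbitrary choices, not values from the dissertation.

```python
import random


def waits(tau, chi, theta):
    """Waits for fixed services chi and show indicators theta (W_1 = 0)."""
    w = [0.0]
    for j in range(len(tau) - 1):
        w.append(max(0.0, w[-1] + theta[j] * chi[j] - (tau[j + 1] - tau[j])))
    return w


def total_wait(tau, chi, theta):
    return sum(waits(tau, chi, theta))


def random_schedule(rng, n, horizon):
    """Ordered arrival times with the first customer fixed at 0."""
    return [0.0] + sorted(rng.uniform(0.0, horizon) for _ in range(n - 1))


def max_midpoint_violation(n=5, horizon=10.0, trials=2000, seed=1):
    """Largest value of f(mid) - (f(a) + f(b)) / 2 seen; convexity implies <= 0."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        chi = [rng.expovariate(0.5) for _ in range(n)]
        theta = [1 if rng.random() < 0.8 else 0 for _ in range(n)]
        a = random_schedule(rng, n, horizon)
        b = random_schedule(rng, n, horizon)
        mid = [(x + y) / 2 for x, y in zip(a, b)]
        gap = total_wait(mid, chi, theta) - 0.5 * (
            total_wait(a, chi, theta) + total_wait(b, chi, theta))
        worst = max(worst, gap)
    return worst


violation = max_midpoint_violation()
```

A violation no larger than floating-point rounding is consistent with the theorem; a numerical check of this kind does not replace the proof.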

4.2 Modification of Simeoni’s Approach

Simeoni examined the case of lattice arrival times, iid Erlang services, equal
weightings for each customer’s expected waiting time, a fixed time horizon, and without
the possibility of no-shows. He proposed an efficient coordinate search algorithm
that is extended here to include independent, general service distributions, cost functions
that are convex combinations of the expected waiting times, and allowance of
no-shows.

Lemma 2 Consider an arbitrary schedule $S_1$. Create $S_2$, in which all customers
arrive at the same times as in $S_1$, with the exception that customer i arrives at
$\tau_i + \delta_i$ instead of $\tau_i$. Select j > i, and create $S_1'$ ($S_2'$), in which all customers arrive
at the same times as in $S_1$ ($S_2$), with the exception that customer j arrives at $\tau_j + \delta_j$
instead of $\tau_j$. Assume $\delta_i \le \tau_{i+1} - \tau_i$ and $\delta_j \le \tau_{j+1} - \tau_j$, so that service order is the
same in each of the four schedules (shown schematically in Figure 8). Then

$$C(S_1) + C(S_2') \le C(S_2) + C(S_1') \qquad (19)$$

Proof: Arbitrarily fix the service vector $\chi$ and the no-show vector $\theta$. Since $S_1$
and $S_1'$ ($S_2$ and $S_2'$) differ only by the arrival time of customer j, it follows that

$$W_n(S_1) - W_n(S_1') = W_n(S_2) - W_n(S_2') = 0, \quad n = 2, 3, \ldots, j - 1 \qquad (20)$$

Figure 8. Arrival time schema for Lemma 2

It is apparent that $W_j(S_2) \ge W_j(S_1)$, since idle server time on the interval $[\tau_i, \tau_i + \delta_i]$
under schedule $S_2$ could be productively employed by customer i under $S_1$. Likewise,
$W_j(S_2') \ge W_j(S_1')$. It follows that there is potentially more idle server time on the
interval $[\tau_j, \tau_j + \delta_j]$ under $S_1$ than on the same interval under $S_2$. Hence, the decrease
in waiting time for customer j realized by changing from $S_1$ to $S_1'$ can be no greater
than the decrease realized by changing from $S_2$ to $S_2'$. Furthermore, the increase in
waiting time for any customer n > j realized by changing from $S_1$ to $S_1'$ must be at
least as great as the increase realized by changing from $S_2$ to $S_2'$. Therefore,

$$W_n(S_1) - W_n(S_1') \le W_n(S_2) - W_n(S_2'), \quad n = j, \ldots, N + 1 \qquad (21)$$

Since the above equations are true for each possible combination of $\chi$ and $\theta$, and
$E[W_n(S)]$ is just a linear combination of these possibilities,

$$E[W_n(S_1)] + E[W_n(S_2')] \le E[W_n(S_2)] + E[W_n(S_1')], \quad n = j, \ldots, N + 1, \qquad (22)$$

and the desired result follows.[1] ∎

[1] This proof is presented in Vanden Bosch, Dietz, and Simeoni [158] and is due mainly to Dietz.

Lemma 2 is proved for continuous arrival times; i.e., $\delta_i$ and $\delta_j$ may take
on any positive values that maintain the same order of scheduled arrivals. In the
remainder of this section, for clarity of exposition, it is assumed that arrival times
are constrained to the evenly spaced lattice points $k\Delta$, with k positive integral and
$\Delta$ positive real. Later, an algorithm will be presented that approximates the optimal
schedule when arrival times are not restricted to lattice points.

The vector function $C(\tau)$ is called submodular if

$$C(x \vee y) + C(x \wedge y) \le C(x) + C(y) \quad \forall x, y \qquad (23)$$

Define $x \vee y$ and $x \wedge y$ as the component-wise maximum and minimum of x and y,
and define piecewise submodularity to be submodularity over the range of x and y
that retain the same component orderings of the two vectors.

Theorem 3 $C(\tau)$ is submodular over $\tau_1 \le \tau_2 \le \cdots \le \tau_{N-1} \le \tau_N$.

Proof: Let J be a set of the customers whose scheduled arrivals in $S_1$ are
shifted later in $S_1'$. Let I be a set of customers whose scheduled arrivals in $S_2$ ($S_2'$)
are shifted later than in $S_1$ ($S_1'$). Without losing generality, let $I \cap J = \emptyset$. Since
$S_1' \wedge S_2 = S_1$ and $S_1' \vee S_2 = S_2'$, the goal is to show that, for any choice of $S_1$, I, J,
and the size of each individual shift,

$$C(S_1) + C(S_2') \le C(S_1') + C(S_2) \qquad (24)$$

If either I or J is empty, Equation 24 follows trivially, with equality holding. If I
and J are each of cardinality 1, the theorem is simply a restatement of Lemma 2.
For other cases, proceed recursively.

Assume the statement is true for sets I and J, each of cardinality less than or
equal to $N_I$. Choose I and J of cardinality $N_I$ and $N_J$, respectively, with $N_J \le N_I$.
Then

$$C(S_1) - C(S_1') \le C(S_2) - C(S_2') \qquad (25)$$

Arbitrarily choose a customer h, and create $S_3$ ($S_3'$) from $S_2$ ($S_2'$) by shifting customer
h’s arrival later by some (equal) amount that does not alter customer arrival order.
Let the list H consist of the single customer h. Again by the assumption,

$$C(S_2) - C(S_2') \le C(S_3) - C(S_3') \qquad (26)$$

Combination of the above equations yields

$$C(S_1) - C(S_1') \le C(S_3) - C(S_3'). \qquad (27)$$

This is the statement of Equation 24 for list I of cardinality $N_I + 1$ and list J of
cardinality less than or equal to $N_I$. The same approach may be used to increase
the cardinality of J and show the theorem holds if both I and J are of cardinality
less than or equal to $N_I + 1$. Thus, the proposition is proved for all I and J by
mathematical induction, and submodularity is established. ∎
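
Submodularity can likewise be spot-checked on fixed sample paths. The sketch below is illustrative rather than part of the proof: it assumes every customer shows, draws arbitrary exponential service times, forms the component-wise maximum and minimum of two random ordered lattice schedules, and compares total waiting time on the two sides of the submodularity inequality. All names and parameter values are assumptions made for the example.

```python
import random


def total_wait(tau, chi):
    """Sum of waits for fixed service times chi, served in scheduled order."""
    w, total = 0.0, 0.0
    for j in range(len(tau) - 1):
        w = max(0.0, w + chi[j] - (tau[j + 1] - tau[j]))
        total += w
    return total


def join(x, y):
    """Component-wise maximum of two schedules."""
    return [max(a, b) for a, b in zip(x, y)]


def meet(x, y):
    """Component-wise minimum of two schedules."""
    return [min(a, b) for a, b in zip(x, y)]


def max_submodularity_violation(n=5, slots=12, trials=3000, seed=2):
    """Largest value of C(x v y) + C(x ^ y) - C(x) - C(y) seen; expected <= 0."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        chi = [rng.expovariate(1.0) for _ in range(n)]
        x = [0] + sorted(rng.randrange(slots) for _ in range(n - 1))
        y = [0] + sorted(rng.randrange(slots) for _ in range(n - 1))
        lhs = total_wait(join(x, y), chi) + total_wait(meet(x, y), chi)
        rhs = total_wait(x, chi) + total_wait(y, chi)
        worst = max(worst, lhs - rhs)
    return worst


violation = max_submodularity_violation()
```

Because the maximum and minimum of two ordered schedules are themselves ordered, every schedule evaluated here keeps the same customer order, which is the setting of the theorem.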

Call the binary vector relation $\preceq$ “earlier than” and define it as follows: $x \preceq y$
if and only if vectors x and y are both the same length (call it N + 1) and $x_i \le y_i\ \forall
i \in [0, N + 1]$. If, in addition, $x \ne y$, then $x \prec y$. The relations $\succeq$ and $\succ$ are defined
similarly.

The  following  two  theorems  are  the  heart  of  the  scheduling  algorithm  to  be 
proposed,  allowing  efficient  fathoming  of  the  solution  space.  Fathoming  is  defined 
as  ensuring  that  some  region  of  the  solution  space  does  not  contain  the  optimal 
solution. 

Theorem 4 Suppose there is a schedule $S_2$ that fathoms all later schedules that
maintain the same customer order. Suppose that $S_1 \preceq S_2$ and that $C(S_1) \le C(S_2)$.
Then $S_1$ fathoms all later schedules that maintain the same customer order.

Proof: Shift the arrivals of an arbitrary set of customers in $S_2$ to arbitrary later
times (without disturbing customer order) to create $S_2'$. Create $S_1'$ from $S_2'$ by shifting
the arrivals of each customer by $S_1 - S_2$. Then by submodularity, $C(S_1) - C(S_1') \le
C(S_2) - C(S_2')$. However, $S_2 \preceq S_2'$, so $C(S_2) - C(S_2') \le 0$, and it follows that
$C(S_1) - C(S_1') \le 0$. Therefore, $S_1$ fathoms all later schedules. ∎

The need for Theorem 3, over and above Lemma 2, is easy to overlook [145].
An example makes its importance clear. Suppose that

$S_2 = [\,0\ 2\ 3\ 4\,]$

$S_1 = [\,0\ 1\ 3\ 4\,]$

$S' = [\,0\ 1\ 4\ 5\,]$

Since it is not true that $S_2 \preceq S'$, it is not clear whether $S_2$ fathoms $S'$, given the
assumptions of Theorem 4. Suppose that $C(S_1) \le C(S_2)$. Lemma 2 ensures that $S_1$
fathoms later schedules that can be constructed from $S_2$ by moving one arrival time
later and one earlier, but construction of $S'$ requires moving one arrival time earlier
and two arrival times later. Despite the fact that $S_1 \preceq S'$, there is no assurance that
$C(S_1) \le C(S')$ without Theorem 3.

Theorem 5 Suppose there is a schedule $S_2$ that fathoms all earlier schedules that
maintain the same customer order. Suppose $S_2'$ is formed from $S_2$ by shifting the
arrival time of customer j an amount $\Delta$ later and that $C(S_2') \le C(S_2)$. Then $S_2'$
fathoms all earlier schedules that maintain the same customer order.

Proof: The proof parallels that of Theorem 4. ∎

The last two theorems immediately suggest a strategy for finding the optimal
schedule. Let K be the number of time slots in the schedule.

Fixed-Lattice Algorithm

1. Establish an early incumbent schedule, $S_E$. If no better bound is available, use
$S_E = [\,0\ 0\ \cdots\ 0\,]$.

2. Let m be the largest integer for which it holds that customer m in $S_E$ is not
scheduled at $t = (K - 1)\Delta$.

3. Establish a candidate early schedule S by shifting the arrival time of customer
m in $S_E$ one time slot ($\Delta$) later, unless this shift causes the order of customer
arrivals to change. If all customers but the first are scheduled at $t = (K - 1)\Delta$,
stop. (Recall that the first customer’s arrival time is fixed at 0.)

4. If $C(S) < C(S_E)$, let $S_E = S$ and return to step 2.

5. If m > 2, decrement m and return to step 3. Otherwise, each customer of the
current $S_E$ has shifted without improvement, and $S_E$ is fixed.
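
The early-incumbent search can be sketched in code. The following is a minimal illustration under several assumptions that are not in the text: customers are indexed from 0, slots are measured in units of the lattice spacing, a shift that would change customer order is skipped in favor of the next earlier customer, and `make_sample_cost` is a hypothetical sample-average stand-in for the cost function rather than the Markov-chain evaluator.

```python
import random


def make_sample_cost(n_customers, overtime_point, n_paths=500, seed=0):
    """Hypothetical cost: mean total wait plus overtime over fixed sample paths."""
    rng = random.Random(seed)
    paths = [[rng.expovariate(1.0) for _ in range(n_customers)]
             for _ in range(n_paths)]

    def cost(schedule):
        total = 0.0
        for services in paths:
            free_at = 0.0
            for tau, x in zip(schedule, services):
                start = max(free_at, tau)
                total += start - tau                     # customer waiting time
                free_at = start + x
            total += max(0.0, free_at - overtime_point)  # server overtime
        return total / n_paths

    return cost


def early_incumbent(cost, n_customers, n_slots):
    last_slot = n_slots - 1
    s_e = [0] * n_customers                    # step 1: earliest schedule
    c_e = cost(s_e)
    while True:
        m = n_customers - 1                    # step 2: last movable customer
        while m >= 1 and s_e[m] == last_slot:
            m -= 1
        if m < 1:
            return s_e, c_e                    # all but the first at the last slot
        improved = False
        while m >= 1:                          # steps 3 to 5
            latest = s_e[m + 1] if m + 1 < n_customers else last_slot
            if s_e[m] < latest:                # shift must preserve the order
                candidate = s_e[:m] + [s_e[m] + 1] + s_e[m + 1:]
                c = cost(candidate)
                if c < c_e:                    # step 4: accept and restart
                    s_e, c_e, improved = candidate, c, True
                    break
            m -= 1                             # step 5: try the previous customer
        if not improved:
            return s_e, c_e


cost = make_sample_cost(n_customers=3, overtime_point=6.0)
s_e, c_e = early_incumbent(cost, n_customers=3, n_slots=6)
```

Each accepted move strictly lowers the cost, so the search terminates; reusing the same sample paths for every schedule keeps the comparisons consistent from one candidate to the next.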

The algorithm for establishing SL, the late incumbent schedule, parallels the
above. In finding SL, if no better initial late bound is available, one should use the
latest possible feasible schedule, in which τ1 = 0 and all other scheduled arrival times
are at tn+1. However, once SE is found, Theorem 8 below will provide a far more
efficient initial bound for SL.
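Because the late search is described only as a parallel of the early one, the following Python sketch is one possible reading, inferred from the order of the iterations in the late half of Table 3 (the earliest movable customer is tried first, and each shift is one slot earlier). The name `late_incumbent` and the list-of-slots representation are assumptions of the sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Schedule = List[int]  # arrival slots, one per customer, in units of the lattice size


def late_incumbent(
    cost: Callable[[Schedule], float],
    start: Sequence[int],
) -> Tuple[Schedule, float, int]:
    """Mirror image of the early search: shift arrivals one slot earlier."""
    s = list(start)
    best = cost(s)
    evals = 1
    improved = True
    while improved:
        improved = False
        # Index 0 (the first customer) stays at slot 0 and is never moved.
        for m in range(1, len(s)):
            if s[m] - 1 < s[m - 1]:
                continue  # the shift would change the customer order
            cand = s[:m] + [s[m] - 1] + s[m + 1:]
            c = cost(cand)
            evals += 1
            if c < best:  # accept and restart from the earliest movable customer
                s, best = cand, c
                improved = True
                break
    return s, best, evals


# Costs copied from the late half of Table 3 (three customers, six slots).
TABLE3_LATE: Dict[Tuple[int, ...], float] = {
    (0, 5, 5): 3.525, (0, 4, 5): 2.137, (0, 3, 5): 1.623, (0, 2, 5): 1.537,
    (0, 1, 5): 1.766, (0, 2, 4): 1.069, (0, 1, 4): 1.215, (0, 2, 3): 1.216,
}

sl, c_sl, n_evals_late = late_incumbent(lambda s: TABLE3_LATE[tuple(s)], [0, 5, 5])
print(sl, c_sl, n_evals_late)  # [0, 2, 4] 1.069 8
```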

The example in Table 3 applies the algorithm twice to a scheduling problem
with three customers and six slots. In nine evaluations, SE is found to be [0,1,3],
and in eight more (two of which have already been evaluated), SL is found to be
[0,2,4], with a slightly lower cost than SE.

A unique property of the algorithm is seen in the decision at iteration 6 of the
early algorithm to abandon an apparently profitable search direction. A substantial
cost reduction had just been achieved by shifting the second customer one slot,
but the algorithm does not continue to shift the second customer, as most search
strategies would do. This apparent inefficiency is necessary in order to ensure all
earlier schedules are fathomed.

The early and late fixed-lattice algorithms together fathom most of the solution
space for this example. However, it is not yet proved that [0,2,4] is the global
optimum, since [0,1,5] has not been evaluated and is neither earlier than SE nor
later than SL. Three additional propositions will prove that SE ⪯ S ⪯ SL, where
S is the optimal lattice schedule, and that for sufficiently small lattices, SE and SL
are quite close.

Lemma 6 If S1 ≺ SE implies C(SE) < C(S1), and S2 ≻ SL implies C(SL) < C(S2),
then SE ⪯ SL.

Proof: It is always possible to form SE from SL by shifting a list of scheduled
arrival times I by Δ earlier and a list J by Δ later, I ∩ J = ∅. It is also always possible
both to form S1 from SE by shifting the scheduled arrival times of the customers in J
by Δ earlier and to form S2 from SL by shifting the scheduled arrival times of the
customers in I by Δ later. Then by submodularity, C(S1) − C(SE) < C(SL) − C(S2).
However, since SE and SL fathom earlier and later schedules, C(S1) − C(SE) > 0
and C(SL) − C(S2) < 0. The contradiction can be resolved only if either I or J is
empty. This implies that either SE ⪯ SL or SL ⪯ SE. It is impossible to have
SL ≺ SE, since the two schedules would then fathom each other, so SE ⪯ SL.

Theorem 7 SE ⪯ S ⪯ SL.

Proof: Let S' be the lowest-cost schedule in the lattice of size Δ such that
SE ⪯ S' ⪯ SL. S' fathoms all schedules that are earlier than SE or later than SL, as
well as those between SE and SL. Then if S' ≠ S, it must be that S contains some list
I of customers whose scheduled arrival times are earlier than their scheduled arrival
times in SE. Likewise, S must contain some list J of customers whose scheduled
arrival times are later than their scheduled arrival times in SL. Form S1 from SE
by shifting the arrival times of customers in I earlier by Δ, and form S2 from SL by
shifting the arrival times of customers in J later by Δ. It follows that S1 ≺ S ≺ S2
and S1 ≺ S' ≺ S2. Then, by Theorem 3,

    C(S1) + C(S2) < C(S') + C(S).    (28)

Since S1 ≺ SE, C(SE) < C(S1), and since C(S') < C(SE) by definition, then
C(S') < C(S1). Also by definition, C(S) < C(S2), so

    C(S1) + C(S2) > C(S') + C(S).    (29)

The contradiction between Equations (28) and (29) can only be resolved by concluding
S cannot be distinct from S', which lies between SE and SL. ∎

Table 3. Determination of optimal early and late schedules

Early schedule algorithm

    iteration   schedule     cost    improvement?
        1       [0, 0, 0]    3.762        Y
        2       [0, 0, 1]    2.819        Y
        3       [0, 0, 2]    2.163        Y
        4       [0, 0, 3]    1.854        Y
        5       [0, 0, 4]    1.903        N
        6       [0, 1, 3]    1.195        Y
        7       [0, 1, 4]    1.215        N
        8       [0, 2, 3]    1.216        N

Late schedule algorithm, starting at latest feasible schedule

    iteration   schedule     cost    improvement?
        1       [0, 5, 5]    3.525        Y
        2       [0, 4, 5]    2.137        Y
        3       [0, 3, 5]    1.623        Y
        4       [0, 2, 5]    1.537        Y
        5       [0, 1, 5]    1.766        N
        6       [0, 2, 4]    1.069        Y
        7       [0, 1, 4]    1.215        N
        8       [0, 2, 3]    1.216        N

Lemma 6 and Theorem 7 guarantee that if the inequality at step 4 of the
fixed-lattice algorithm is strict, then SE ⪯ S ⪯ SL. If, at some point, equality is
observed at step 4 for each possible direction of improvement, the algorithm will stop.
This is appropriate for convex functions such as the proposed cost function, since a
“flat spot” implies that the optimum cost has been reached; if C(SE) = C(SL), all
schedules between SE and SL are optimal.

Remarkably, the algorithm as presented so far will also bound the optima
of a non-convex submodular function by SE and SL. Of course, the quality
of the bounds depends on the nature and location of the nonconvexities; two local
minima far apart ensure the bounds will also be far apart, since the algorithm cannot
“pass through” such areas. Topkis proved that the set of minima of a submodular
function defines a sublattice, and that result could be useful in the search for minima
within these bounds [155, 156].

The following theorem establishes the efficacy of the bounds obtained only for
convex functions, such as the proposed cost function.

Theorem 8 Suppose C(SE) ≠ C(SL). Further suppose that either no two customers
in SE or that no two customers in SL share the same arrival time. Then for each j,
τj differs between SE and SL by at most Δ.

Proof: Assume the opposite of the conclusion; suppose the scheduled arrival
time of customer i had to be shifted more than Δ from its scheduled arrival time
in SE to reach its corresponding place in SL. Let S1 be the schedule formed from
SE by shifting i one time slot later. (The second condition of the theorem ensures
this shift does not create an infeasible schedule.) It must be that C(S1) ≥ C(SE),
or else the algorithm for obtaining SE would have found S1 as the early incumbent
schedule. Likewise, C(S1) ≥ C(SL). Since SE ≺ S1 ≺ SL, convexity is violated
if C(S1) > C(SE) or C(S1) > C(SL). Therefore, C(SE) = C(S1) = C(SL). This
proves the contrapositive of the theorem, so the theorem itself is also true.

It may be that C(SE) = C(SL) for a convex function. If τi differs in SE and
SL by more than Δ for some i, it must be that the function is “flat” between these
bounds with respect to τi, since otherwise the algorithm would have continued to
find a better SE or SL. The situation C(SE) = C(SL) is exceedingly rare and easily
recognized and remedied when it occurs, so it is of minimal concern. In all other
cases, the algorithm has been proved to produce tight bounds on the optimum for
functions that are convex and for which Theorem 2 holds.

Normally,  there  is  no  problem  when  two  customers  occupy  the  same  slot;  the 
conclusion  still  holds.  However,  there  is  the  remote  possibility  in  some  circumstances 
that  in  the  search  for  SE,  a  suboptimal  schedule  will  be  encountered  in  which  two 
adjacent  customers  occupy  the  same  slot  and  for  which  any  shift  results  in  a  higher 
cost.  This  happens  when  the  lattice  size  is  so  extremely  coarse  that  a  single  shift 
of  the  latter  customer  traverses  the  entire  region  of  improvement  and  ends  up  on 
the  “opposite  side”  of  the  convex  surface,  regardless  of  the  positions  of  subsequent 
customers.  When  the  latter  customer  is  shifted,  the  cost  increases,  and  when  the 
former is increased, an infeasible schedule ensues, forcing the search to fix the arrival
times of these two customers prematurely. Since the same may happen for SL at
a very different schedule, the conclusion of Theorem 8 no longer is true, and the
number of schedules to be fathomed can be quite large.

When two customers are co-scheduled in SE, a straightforward test to see if
the above problem pertains is to shift all customers 2Δ later (if feasible) instead of
Δ and search for SL from this point. If the two customers in question are “glued”,
SE and SL will differ by more than Δ for these customers. Otherwise, the optimum
can be obtained normally, at a cost of at most N − 1 schedule evaluations. This
procedure is incorporated into the program implementing the fixed-lattice algorithm
provided in Section H.1, and the user is alerted to any problem.

If  a  problem  is  indeed  ascertained,  a  reasonable  approach  is  to  meld  these  two 
customers into a single customer and restart the search for SE at the point that
the  search  stopped.  This  approach  has  not  been  automated  and  is  left  to  the  user’s 
control. 

It  is  emphasized  that  this  co-scheduling  problem  has  only  been  encountered 
when  using  lattice  sizes  on  the  order  of  or  larger  than  the  customer  mean,  which  is 
unlikely  to  arise  in  actual  situations. 

As  noted  above,  Theorem  8  provides  an  effective  bound  for  SL  once  SE  is  found 
(or  vice-versa).  Instead  of  starting  the  search  for  SL  at  the  latest  possible  schedule, 
one  can  start  at  the  schedule  formed  by  shifting  each  of  the  arrival  times  of  SE  one 
unit  later,  if  feasible.  (It  is  never  feasible  to  leave  the  first  schedule  slot  empty  or  to 
shift  an  arrival  time  past  the  schedule  horizon.)  If  it  is  suspected  that  the  optimum 
schedule  is  closer  to  the  latest  possible  schedule  than  to  the  earliest,  one  may  reduce 
the number of iterations required by finding SL first, then finding SE.
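The warm start described above amounts to a one-slot shift with clipping. The Python sketch below is an illustration only; the function names are hypothetical, schedules are lists of slot indices, and the first customer is kept at slot 0 as the text requires. The two calls reuse schedules quoted in the examples of this chapter.

```python
from typing import List, Sequence


def late_start_from_early(se: Sequence[int], n_slots: int) -> List[int]:
    """Starting point for the SL search: shift each arrival of SE one slot later.

    The first customer stays at slot 0, and no arrival moves past the last slot.
    """
    last = n_slots - 1
    return [se[0]] + [min(t + 1, last) for t in se[1:]]


def early_start_from_late(sl: Sequence[int]) -> List[int]:
    """Starting point for the SE search when SL is found first (mirror case)."""
    return [sl[0]] + [max(t - 1, 0) for t in sl[1:]]


start_late = late_start_from_early([0, 1, 3], n_slots=5)
start_early = early_start_from_late([0, 94, 226, 300, 300])
print(start_late)   # [0, 2, 4]
print(start_early)  # [0, 93, 225, 299, 299]
```

Clipping every shifted arrival at the same bound keeps the arrivals in nondecreasing order, so the starting schedule preserves the customer sequence.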

The fixed-lattice algorithm is a sort of cyclic coordinate search, in which a series
of tests for improvement are performed along the coordinate axes in a specified order.
However, as noted earlier, if there is an improvement, the new candidate optimum
is immediately accepted, and the search in that direction is halted in favor of a
new direction. Such a search method seems counterintuitive; it seems reasonable
to persist in a search direction for as long as an improvement is realized, as many
nonlinear programs (NLPs) do. The reason the search is halted is that Theorems
4 and 5 are of help in fathoming solutions only if changes of Δ are made at each
step. That this search algorithm is often an improvement over other approaches
(cf. Appendix C) is partly due to the increase in schedule cost as customers are
scheduled close together during the optimization process; a standard NLP approach
such as Hooke-Jeeves cannot achieve much improvement over the cost function by
continuing for long in any single direction, limiting the effectiveness of any line search
it employs.


4.3 Fixed-Lattice Examples

A  better  understanding  of  the  problem  and  solution  algorithm  can  be  gained  by 
considering  some  examples.  Let  I(SE),  I(SL),  and  I(S)  be  the  number  of  iterations 
(schedule  cost  evaluations)  required  by  the  early,  late,  and  enumeration  phases  of 
the  algorithm. 

The example from the beginning of this chapter is considered first, in which
there are 5 lattice points (slots), 3 customers, and services are iid exponential
with mean of 1. Overtime commences at the schedule horizon. The optimal
schedules were already obtained by exhaustive enumeration of the 15 possibilities.
For the specific case of c2 = c3 = c4, the lattice algorithm produces

    SE = [ 0 1 3 ]    C(SE) = 0.9170    I(SE) = 8
    SL = [ 0 2 4 ]    C(SL) = 0.8369    I(SL) = 2
    S  = [ 0 2 4 ]    C(S)  = 0.8369    I(S)  = 0

Here, SE was found using the fixed-lattice algorithm, halting after 8 iterations.
A candidate SL was formed by shifting the second and third customer arrival times
of SE one slot later. The algorithm evaluated two additional schedules, [0 1 4] and
[0 2 3], but no improvement was obtained. Since SL and SE differed, the schedules
between them had to be exhaustively evaluated. However, there are only two
schedules between them, and these already were evaluated during the calculation
of SL. Thus, no new schedules were evaluated during the enumeration phase, and
S = SL.

Figure 7 shows that [0 1 3] and [0 2 4] are equal in cost when c4/(c4 + c3) ≈ 0.569.
It turns out that SE and SL differ by two customers if c4/(c4 + c3) ∈ [0.45, 0.62]. For
c4/(c4 + c3) ∈ [0, 0.45] ∪ [0.62, 0.92], SE = SL.

For  this  small  example,  the  algorithm  provides  little  improvement  over  full 
enumeration  of  the  schedules.  Extending  the  number  of  schedule  slots  to  10  and  the 
number  of  customers  to  11  (while  maintaining  the  same  service  mean)  produces 

    SE = [ 0 1 2 3 4 5 6 7 8 9 9 ]    C(SE) = 16.26    I(SE) = 153
    SL = [ 0 1 2 3 4 6 7 8 9 9 9 ]    C(SL) = 16.02    I(SL) = 18
    S  = [ 0 1 2 3 4 6 7 8 9 9 9 ]    C(S)  = 16.02    I(S)  = 6


Here,  177  schedules  were  evaluated  out  of  the  92,378  possible,  showing  the 
potential  of  the  fixed-lattice  algorithm. 

An extreme example with respect to computation is seen with 20 slots, 15
customers, c2 = ... = c16 = 1, and an Erlang-4 service distribution with mean of
2Δ. The algorithm produces

    SE = [ 0 1 3 5 7 9 11 13 15 17 19 19 19 19 19 ]Δ
    C(SE) = 51.06    I(SE) = 458

    SL = [ 0 2 4 6 8 10 12 14 16 18 19 19 19 19 19 ]Δ
    C(SL) = 50.72    I(SL) = 10

    S  = [ 0 2 4 6 8 10 12 14 16 18 19 19 19 19 19 ]Δ
    C(S) = 50.72    I(S) = 492


In this case, the majority of the evaluations are consumed in resolving the small
difference between SE and SL, eventually obtaining no improvement. It is very rare
that SE and SL differ in nearly half their slots. Even with such a large enumeration
phase, only 960 of the possible 818,809,200 schedules required evaluation.
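The schedule counts quoted in these examples (15; 92,378; 818,809,200) are consistent with counting the nondecreasing assignments of the N − 1 movable customers to K slots, with the first customer fixed at slot 0. The text does not state this formula; the Python sketch below is an inference that happens to reproduce all three figures, and the function name is hypothetical.

```python
from math import comb


def n_order_preserving_schedules(n_customers: int, n_slots: int) -> int:
    """Count schedules with the first arrival fixed at slot 0 and the remaining
    arrivals nondecreasing over n_slots slots (combinations with repetition)."""
    return comb(n_slots + n_customers - 2, n_customers - 1)


small = n_order_preserving_schedules(3, 5)      # 3 customers, 5 slots
medium = n_order_preserving_schedules(11, 10)   # 11 customers, 10 slots
large = n_order_preserving_schedules(15, 20)    # 15 customers, 20 slots
print(small, medium, large)  # 15 92378 818809200

# Fraction of the solution space evaluated in the two larger examples.
print(177 / medium, 960 / large)
```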

Simeoni conjectured that S, the lattice optimum, always coincided with either
SL or SE [145]. A counterexample is the case in which 12 customers are to be
scheduled into 10 slots. No-shows are not allowed, and services are iid exponential
with mean equal to Δ. Costs are linear, overtime commences at the schedule horizon,
and c2 = ... = c13 = 1. For this case,

    SL = [ 0 1 2 3 4 5 6 8 9 9 9 9 ]Δ    C(SL) = 21.31    I(SL) = 108
    SE = [ 0 0 1 3 4 5 6 7 8 9 9 9 ]Δ    C(SE) = 21.10    I(SE) = 27
    S  = [ 0 1 2 3 4 5 6 7 8 9 9 9 ]Δ    C(S)  = 21.04    I(S)  = 20

C(SL) and C(SE) are quite close, preventing the algorithm from resolving the
optimum until the 20 schedules between SL and SE are enumerated. Each of these
enumerated schedules is also quite close in cost, but only S above is lower in cost
than both SE and SL.

If SE and SL differ by k arrivals, there are 2^k schedules between them,
inclusive. However, some of those schedules cause the order of arrivals to change, and
others have already been evaluated in the process of finding SE and SL. The actual
number of schedules required to be enumerated is far less and is discussed in detail
in Appendix C.
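The enumeration phase can be illustrated with a short Python sketch. It assumes the conclusion of Theorem 8 holds, so that each customer takes either its SE slot or its SL slot, and it discards combinations that change the order of arrivals. The function name is hypothetical, and the sketch does not attempt the bookkeeping of already-evaluated schedules discussed in Appendix C.

```python
from itertools import product
from typing import List, Sequence


def schedules_between(se: Sequence[int], sl: Sequence[int]) -> List[List[int]]:
    """List the order-preserving schedules between SE and SL, inclusive."""
    # A customer whose slot is the same in both bounds has a single choice.
    choices = [(a,) if a == b else (a, b) for a, b in zip(se, sl)]
    result = []
    for cand in product(*choices):
        # Keep only candidates whose arrivals remain in nondecreasing order.
        if all(x <= y for x, y in zip(cand, cand[1:])):
            result.append(list(cand))
    return result


between = schedules_between([0, 1, 3], [0, 2, 4])
print(between)  # [[0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4]]
```

For the three-customer example, k = 2 gives 2^2 = 4 schedules inclusive, of which only [0 1 4] and [0 2 3] lie strictly between the bounds.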

If SE and SL do differ, the likelihood is very high that the optimum is one of
these two schedules (cf. Appendix C). For instance, in a series of 3000 optimizations
of 10-customer, 21-slot schedules with parameters chosen randomly from a realistic
set, each of the optima coincided with either SE or SL. When the optimum does
not coincide with either SL or SE, it appears very likely that the optimum schedule
differs by only two customers from either SL or SE, as in the counterexample. No
case of the optimum differing from both SE and SL by more than two customers has
been observed, although such a situation cannot be ruled out.

The above examples show that C(SE) ≈ C(SL) ≈ C(S). Thus, in cases in
which suboptimal solutions are acceptable, one may decide just to obtain SL or
SE and use it as a suboptimal solution. One is assured by Theorem 8 that each
arrival time in this approximation is within Δ of its optimum. In each of the rare
cases observed in which the optimum was neither SE nor SL, the cost improvement
obtained by selecting the global optimum over those schedules was less than 0.5%.


4.4 Algorithms for Finding the Optimal Fine-Lattice or Continuous Schedule

A  natural  extension  to  the  fixed-lattice  algorithm  is  first  to  apply  the  fixed- 
lattice  algorithm  using  a  coarse  lattice  in  an  effort  to  fathom  schedules  under  a 
finer  lattice  more  efficiently.  Such  an  algorithm  might  also  be  employed  to  obtain 
increasingly  better  approximations  to  the  unconstrained  (continuous)  optimum  by 
setting  the  lattice  size  successively  smaller.  Such  an  algorithm  was  attempted  by 
Simeoni  [145].  His  success  was  limited  by  the  lack  of  a  way  to  obtain  efficient 
bounds that apply when the lattice size is altered. (That SE and SL are no longer
necessarily bounds under a new lattice size is immediately apparent if one considers
the possibility that SE = SL.) This lack is remedied by the following four corollaries
to Theorem 8.

Corollary 9 Suppose SE is obtained for arrival times constrained to a lattice of size
Δ. Let S' be the optimum schedule for the same problem with lattice size Δ' such
that Δ/Δ' is a positive integer. Let 1_{N+1} be a vector of N + 1 elements, all equal to
unity. Then S' ⪯ SE + Δ 1_{N+1}.

Proof: For these particular values of Δ', SE + Δ 1_{N+1} lies in the new lattice
system, so C(S') < C(SE + Δ 1_{N+1}). In addition, C(S) < C(SE + Δ 1_{N+1}), where
S is the optimum under lattice size Δ. If the conclusion is assumed false, then
S ⪯ SE + Δ 1_{N+1} ≺ S'. However, in contradiction with Theorem 1, the cost function
cannot be convex now, since a maximum exists on the line segment connecting S,
SE + Δ 1_{N+1}, and S', and that maximum is not at an endpoint. The assertion is
thus proved indirectly.

Corollary 10 Suppose SL is obtained for arrival times constrained to a lattice of
size Δ. Let S' be the optimum schedule for the same problem with lattice size Δ'
such that Δ/Δ' is a positive integer. Then S' ⪰ SL − Δ 1_{N+1}.

Proof: Parallels that of Corollary 9. ∎

Corollaries 9 and 10 proved that if Δ' is an integral fraction of Δ, then SL − Δ ⪯
S' ⪯ SE + Δ. The same bounds hold for the continuous optimum, S.

Corollary 11 SL − Δ 1_{N+1} ⪯ S ⪯ SE + Δ 1_{N+1}.

Proof: Parallels those of Corollaries 9 and 10. ∎

If Δ' is not an integral fraction of Δ, the proofs for Corollaries 9 and 10 fail,
since the relationship between the two optimum lattice schedules is unclear; in some
cases, a smaller lattice leads to a larger optimal cost. A further relaxation of the
search bounds is required.

Corollary 12 Suppose SE and SL bound the optimum under lattice size Δ. Let S'
be the optimum schedule for the same problem with lattice size Δ' such that Δ' < Δ.
Then SL − (Δ + Δ') 1_{N+1} ⪯ S' ⪯ SE + (Δ + Δ') 1_{N+1}.

Proof: By Corollary 11, SE and SL must lie at most Δ from S. Likewise, S'E
and S'L, the bounds under lattice size Δ', must lie at most Δ' from S. Then S'E lies
at most Δ + Δ' from SL, and S'L lies at most Δ + Δ' from SE. Since SE ⪯ SL, and
S'E ⪯ S' ⪯ S'L, the desired result follows. ∎

The rough bounds obtained when reducing the lattice size by an integral
fraction are substantially better than those in the general case, and one should take
advantage of this property whenever possible. However, sometimes the choices of
lattice size are limited; one may have to resort to a non-integral reduction in lattice
size. An algorithm for obtaining the exact optimum for a fine lattice size Δ follows.

Variable-Lattice  Algorithm 

1.  Choose a coarse lattice size, Δ'. When feasible, make Δ' a multiple of Δ. Set
    S'E, the lower bound on the optimum schedule under lattice size Δ', equal to
    [0, 0, ..., 0].

2.  Improve S'E using the fixed-lattice algorithm, with the current value of S'E as
    a starting point.

3.  Choose Δ'' such that Δ ≤ Δ'' < Δ'. When feasible, choose Δ'' such that Δ' is
    a multiple of Δ'', or Δ'' is a multiple of Δ, or (preferably) both.

4.  If Δ' is a multiple of Δ'', let S''L = S'E + Δ' 1_{N+1}. Otherwise, let S''L = S'E +
    (Δ' + Δ'') 1_{N+1}. S''L is an upper bound on the optimum schedule under lattice
    size Δ''.

5.  Improve S''L using the fixed-lattice algorithm, with the current value of S''L as
    a starting point.

6.  Choose Δ' such that Δ ≤ Δ' < Δ''. When feasible, choose Δ' such that Δ'' is a
    multiple of Δ', or Δ' is a multiple of Δ, or (preferably) both, to take advantage
    of the improvement in bounds under Corollaries 9 and 10.

7.  If Δ'' is a multiple of Δ', let S'E = S''L − Δ'' 1_{N+1}. Otherwise, let S'E =
    S''L − (Δ'' + Δ') 1_{N+1}.

8.  If Δ'' ≠ Δ, go to step 2. Otherwise, stop; S'E and S''L are bounds to the
    optimal solution under Δ. Since S''L ⪯ S'E + Δ 1_{N+1}, the bounds are close, and
    an exhaustive search is feasible.
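The bound transfers in steps 4 and 7 can be sketched in Python as a change of units followed by a margin. Everything here is an assumption made for illustration: the function names, the use of K − 1 (the number of lattice intervals) to express the lattice sizes, and in particular the rounding down to a lattice point in the non-integral case, which the text does not specify. With these choices the sketch reproduces the starting schedules listed in Table 4.

```python
from fractions import Fraction
from math import floor
from typing import List, Sequence


def _ratio_and_margin(k1_coarse: int, k1_fine: int):
    """Fine slots per coarse slot, and the margin in fine-slot units.

    The margin is one coarse step when the reduction is integral, and one
    coarse step plus one fine step otherwise (Corollaries 9, 10, and 12).
    """
    ratio = Fraction(k1_fine, k1_coarse)
    margin = ratio if ratio.denominator == 1 else ratio + 1
    return ratio, margin


def upper_start(se_coarse: Sequence[int], k1_coarse: int, k1_fine: int) -> List[int]:
    """Step 4: starting late schedule on the finer lattice, from a coarse SE."""
    ratio, margin = _ratio_and_margin(k1_coarse, k1_fine)
    # Rounding down is an assumption; the first customer stays at slot 0.
    return [0] + [min(k1_fine, floor(t * ratio + margin)) for t in se_coarse[1:]]


def lower_start(sl_coarse: Sequence[int], k1_coarse: int, k1_fine: int) -> List[int]:
    """Step 7: starting early schedule on the finer lattice, from a coarse SL."""
    ratio, margin = _ratio_and_margin(k1_coarse, k1_fine)
    return [0] + [max(0, floor(t * ratio - margin)) for t in sl_coarse[1:]]


row_32 = upper_start([0, 2, 6, 8, 8], 8, 32)
row_128 = lower_start([0, 10, 24, 32, 32], 32, 128)
row_300 = upper_start([0, 40, 97, 128, 128], 128, 300)
print(row_32)   # [0, 12, 28, 32, 32]
print(row_128)  # [0, 36, 92, 124, 124]
print(row_300)  # [0, 97, 230, 300, 300]
```

Exact rational arithmetic (`Fraction`) is used so that the non-integral step from K − 1 = 128 to K − 1 = 300 is not affected by floating-point rounding.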

For an approximation to the unconstrained optimum, set Δ to the desired
accuracy  and  apply  the  variable-lattice  algorithm.  Initial  experiments  indicate  that, 
for  obtaining  the  unconstrained  optimum,  the  lattice  size  should  be  decreased  at 
each  step  by  a  factor  of  at  least  four;  smaller  reductions  do  not  appear  effective  at 
reducing  the  number  of  cost  evaluations  required.  One  should  start  with  a  lattice 
with  at  least  four  points  within  the  time  horizon,  or  else  the  added  problem  setup 
time  will  more  than  offset  any  savings  in  time.  When  using  the  algorithm  to  obtain 
the  exact  optimum  for  a  fine  lattice,  these  rules  are  mitigated  by  the  better  bounds 
obtained  if  integral  reductions  are  chosen. 

4.5 Variable-lattice Example

As an example, consider a problem with 5 customers and 301 schedule slots.
The services are all Erlang(2) with mean of 1, and customers always show. The
schedule horizon and the overtime point are both 2, and the cost coefficients all equal 1.
The optimum schedule is [ 0 94 226 300 300 ], which takes 367 evaluations and
776 seconds to find using the fixed-lattice algorithm. One variable-lattice approach
is shown in Table 4. Here, the approach is to start the fixed-lattice algorithm at
K − 1 = 8 and quadruple K − 1 at each iteration of the variable-lattice approach
except the last. For the example, this approach took a total of
45.3 seconds, a reduction of 94% over the fixed-lattice approach. The total number
of schedule evaluations required is 70, but time is a better measure of effectiveness
here, since the time to perform a single evaluation is dependent on K.

Alternate  paths  to  reach  K  -  1  =  300  were  tried,  with  (10,  60,  300),  (15,  75, 
150,  300),  and  (10,  50,  150,  300)  taking  slightly  more  time  than  (8,  32,  128,  300). 
These paths were tailored to the specific problem so that the reduction in Δ is an
integer at each iteration. However, no improvement in run time was attained over
the selected generic approach, suggesting that the improvement in bounds when an
integer reduction is achieved is not worth pursuing. For comparable problem sizes,
a fourfold reduction in Δ for each iteration is a reasonable value.

For larger problems, initial experiments suggest a reduction in Δ of a factor
of 2 for each iteration is effective. For example, given the solution of the above
problem for K − 1 = 300, a subsequent single iteration to reach K − 1 = 1200
takes 141 seconds, while taking the path (300, 600, 1200) takes 27 + 76 = 103 seconds.
An approach that has proved effective in practice is to switch from a fourfold to a
twofold reduction in Δ when K > 100.


Table 4. Comparison of fixed- and variable-lattice results for a sample problem with
N = 5 and K − 1 = 300. The schedules are in units of the current Δ.

    K − 1   starting schedule     optimal schedule          cost      evals   time (s)
        8   [0 0 0 0 0]           SE: [0 2 6 8 8]           7.82195       7       2.63
       32   [0 12 28 32 32]       SL: [0 10 24 32 32]       7.80134      15       3.57
      128   [0 36 92 124 124]     SE: [0 40 97 128 128]     7.80133      24       9.18
      300   [0 97 230 300 300]    SL: [0 94 226 300 300]    7.80127      16      20.38
      300   [0 93 225 299 299]    SE: [0 94 226 300 300]    7.80127       8       9.55

Little improvement is attained in the optimum as the number of slots is
increased. This is generally true, and is justification for limiting the lattice size in
practical problems. This limits the usefulness of the variable-lattice algorithm. The
reader should not get the impression from this that the cost function is flat; the cost
of the intuitively attractive equi-spaced schedule (i.e., [ 0 60 120 180 240 ]Δ for
K − 1 = 300) is 8.87, 13% higher than the coarsest lattice solution tabulated.


4.6 The Dynamic Problem

The  goal  of  this  chapter  has  been  to  solve  the  static  problem:  What  is  the 
optimal  schedule  for  a  given  sequence  if  it  must  be  determined  before  the  start  of 
the  schedule?  In  the  dynamic  problem,  the  schedule  must  be  determined  at  some 
time  td  after  the  start,  given  that  the  realizations  of  the  services  to  that  point  are 
known.  This  section  solves  this  dynamic  problem  by  transforming  it  to  a  static 
problem. 

Two researchers have solved dynamic problems, both assuming exponential
services. Wang solved a different dynamic problem than that above: if a new
customer is added to the system at td, what should be its scheduled arrival time, given
the scheduled arrival times of the other customers are fixed [160]? This is a realistic
problem, in that it may not be possible to reschedule jobs, but an analytical
treatment leads to little improvement in the cost when compared to the current practice
of “just sticking it in somewhere”. Liao solved the dynamic problem treated here
for iid Erlang service times [97], but it is not possible to extend his solution even to
idd Erlang services unless the current phase of each customer is known at t. This is
seldom the case.

Suppose that at td, Nc customers have completed service and Ns are in service.
The task is to schedule the remaining N − Nc − Ns customers into the interval
[td, Th]. The customers who completed service obviously do not affect the optimal
dynamic schedule. However, the ones in service pose the problem noted by Liao: if
the number of phases completed thus far is not known, the memoryless properties of
the exponential phases cannot be exploited. To employ the phase-type distribution
approach to cost calculation recommended in Chapter III, it is necessary to revise
the state probability vector p(td).

This is a relatively straightforward task, since ts, the starting time of customer
Nc + 1 (the customer in service at td), is known. Start the cost algorithm at ts, with
all the probability mass in the first phase of customer Nc + 1, and determine the
probability vector at td. This probability vector is obviously incorrect, since it gives
a nonzero probability of customer Nc + 1 and those subsequent having completed
service at td, which is known at td not to be the case. However, this probability
vector is easily transformed into the correct one, which is conditioned on customer
Nc + 1 not having completed service at td. All that is necessary is to zero out the
probabilities of states corresponding to customer Nc + 1 having completed service and
then renormalize so that the probabilities still sum to 1.0. Now the cost algorithm can
be restarted and will provide correct expected waiting times for customers subsequent
to customer Nc + 1. The optimization algorithms provided in this chapter obviously
still work, since the dynamic problem is transformed by this artifice into a static
problem. The fact that this new static process starts with a customer in service
poses no difficulties.

Consider  an  example.  Five  customers  each  have  Erlang(4)  services  with  mean 
of  1.0,  show  probabilities  of  1.0,  and  were  to  be  scheduled  into  [0,  5].  Cost  coefficients 
and  overtime  coefficient  are  1.0.  The  optimal  static  schedule  using  a  lattice  size  of 
0.05  was  [  0.00  1.05  2.30  3.55  4.65  ],  and  the  optimal  cost  was  1.885,  of  which 
1.51  came  from  the  overtime  and  waiting  times  of  customers  3,  4,  and  5.  Suppose 
the  first  customer  completed  service  before  the  second  arrived  and  that  at  td  =  2.0, 
the  second  is  still  in  service. 

The first task is to determine $p(2.0)$, given the knowledge above. At the start
of customer 2's service, $t_s$, all the probability mass is concentrated in the first phase
of customer 2:

p(t_s) = [ 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 ... ]

Since customer 1 was not in service upon customer 2's arrival, $t_s = 1.05$, the arrival
time of customer 2 in the optimal static schedule. Application of Equation (7) using
the transition matrix from the original static problem and an elapsed time of 0.95
(from $t_s = 1.05$ to $t_d = 2.0$) yields

p(2.0) = [ 0.000 0.000 0.000 0.000 0.022 0.085 0.162
           0.205 0.194 0.148 0.094 0.051 0.024 0.010
           0.004 0.001 0.000 0.000 0.000 0.000 ]

(This row vector is displayed in three rows due to space limitations.) Since it is
known that at 2.0, customer 2 is still in service, all states other than 5 through 8
must be zeroed and $p(2.0)$ must be renormalized, yielding

p(2.0) = [ 0.0 0.0 0.0 0.0 0.047 0.179 0.341 0.432 0.0 ... ]
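
The conditioning step of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cost routine of Chapter III: because customer 2 has Erlang(4) service with mean 1.0 (four phases, each of rate 4), the number of phases completed after an elapsed time of 0.95 is Poisson distributed, and that closed form stands in here for Equation (7). The function name `condition_on_in_service` and the 0-based state indexing are hypothetical.

```python
import math

def condition_on_in_service(p, in_service_states):
    """Zero every state outside in_service_states, then renormalize to sum to 1."""
    kept = [x if i in in_service_states else 0.0 for i, x in enumerate(p)]
    total = sum(kept)
    if total <= 0.0:
        raise ValueError("conditioning event has zero probability")
    return [x / total for x in kept]

# Assumption: Erlang(4) with mean 1.0 means four phases, each with rate 4.
phase_rate = 4.0
elapsed = 2.0 - 1.05
a = phase_rate * elapsed

# Unconditioned probability that k phases have been completed by time 2.0,
# for k = 0..15, placed after the four (empty) states of customer 1.
p_unconditioned = [0.0] * 4 + [
    math.exp(-a) * a**k / math.factorial(k) for k in range(16)
]

# States 5 through 8 in the text are indices 4..7 here.
p_conditioned = condition_on_in_service(p_unconditioned, {4, 5, 6, 7})
print([round(x, 3) for x in p_conditioned[4:8]])
```

The four printed values agree with the renormalized vector above to within rounding in the third decimal place.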

The  cost  algorithm  may  now  be  started  at  2.0  using  this  initial  probability  vector 
and  the  original  transition  matrix.  Application  of  the  fixed-lattice  algorithm  to  this 
transformed  problem  yields  the  optimal  dynamic  schedule 

[  0.00  1.05  2.45  3.65  4.75  ] 

which  is  slightly  later  than  the  original  static  schedule. 

The cost contributed by the overtime and waiting times of customers 3, 4, and
5 in this dynamic schedule is 1.72, substantially more than the 1.51 given in the
static schedule. This is to be expected, since the realization of customer 2 turned
out to be larger than was expected at the start of the schedule. More appropriate
is to compare this cost to that obtained if the static schedule had been kept, given
customer 2 had not completed service by t = 2.0. This cost is 1.74, so improvement under the
dynamic schedule was slight in this case. In situations where the realizations of
services are far from their means, when the show probability is appreciably less than
1.0, or when customers are removed or added to the system, the dynamic solution is
expected to be a substantial improvement over retaining the static solution.

4.7 Variations on the Scheduling Problem

The  problem  as  defined  may  not  seem  relevant  to  some  applications,  due  to 
certain  features  not  modeled.  However,  the  algorithm  is  robust  enough  to  tolerate 
certain  modifications.  As  one  example,  suppose  a  doctor  required  30  minutes  blocked 
out  of  his/her  morning  of  seeing  outpatients  in  order  to  perform  surgery  rounds. 
This  could  be  modeled  in  several  ways.  If  the  duration  and  start  of  this  period  were 
flexible,  one  way  to  model  it  would  be  as  an  added  patient  with  a  mean  service  time 
of  30  minutes.  If  the  duration  and  start  of  this  period  were  fixed,  and  it  was  deemed 
unacceptable  to  have  patients  wait  until  the  doctor  returned,  then  it  would  be  best 
to  break  the  morning  into  two  separate  schedules. 

Suppose  the  round  times  were  fixed  and  it  was  deemed  appropriate  for  patients 
to  wait  during  rounds,  even  if  the  doctor  had  to  leave  in  the  middle  of  the  consult. 
Then  one  could  take  advantage  of  the  fact  that  the  lattice  algorithms  work  even 
when  the  amount  of  time  between  schedule  slots  is  not  constant.  One  could  create  a 
gap  between  two  slots  precisely  equal  to  the  duration  of  rounds.  One  would  expect 
in  this  case  to  see  a  very  different  optimal  patient  sequence  and  schedule  than  in  the 
previous  treatments. 
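
As a small illustration of the non-constant slot spacing described above, the sketch below builds a lattice of slot times with one gap. It shows only the slot construction, not how the server's absence enters the cost calculation; the function name `lattice_with_gap` and the times used (a 240-minute morning, 15-minute slots, rounds from minute 120 to minute 150) are hypothetical.

```python
def lattice_with_gap(start, end, step, gap_start, gap_end):
    """Slot times from start to end in increments of step, omitting any slot
    that falls strictly inside the blocked interval (gap_start, gap_end)."""
    n = int(round((end - start) / step))
    slots = [round(start + k * step, 10) for k in range(n + 1)]
    return [t for t in slots if not (gap_start < t < gap_end)]

# Hypothetical morning: 0 to 240 minutes, 15-minute slots, rounds from 120 to 150.
slots = lattice_with_gap(0, 240, 15, 120, 150)
print(slots)
```

The consecutive slots at 120 and 150 are then separated by exactly the duration of rounds, while all other neighbors remain 15 minutes apart.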

The service protocol so far has been first-come, first-served (FCFS). If a priority
or preemptive protocol obtains, the cost function is still submodular, and the lattice
algorithms still function. Only the cost calculation would become more involved.
Likewise, if the single server were replaced by a network of servers, the argument for
the submodularity of the cost function is unchanged.

It is thus seen that the proposed algorithms are rather more robust than presented
and are flexible enough to model a variety of features encountered in appointment
systems.

V. Determining the Optimal Sequence of Arrivals

The previous chapter addressed the problem of determining an optimal schedule
for a given sequence of arrivals. The task left is to determine which sequence will
be optimal for a given problem. For small problems, this can be accomplished by
exhaustive enumeration of all sequences, calculation of their optimal schedules, and
selection of the best alternative. However, this approach quickly becomes unwieldy
as the number of customer classes increases.

This  chapter  examines  some  characteristics  of  optimal  sequences.  The  optimal 
sequence  is  seldom  one  in  which  customers  are  ordered  by  weighted  means,  weighted 
variances,  or  by  any  other  simple  measure.  In  fact,  the  optimal  sequence  frequently 
places  identical  customers  at  very  different  places  in  the  schedule.  It  will  be  seen 
that  this  surprising  behavior  is  inherent  even  in  simple  deterministic  problems.  There 
is  evidence  that  even  deterministic  appointment  sequencing  problems  are  NP-hard. 
Several  solution  approaches  to  the  stochastic  problem  are  considered.  These  are 
abandoned  in  favor  of  a  heuristic  approach,  which  will  be  shown  to  be  quite  effective 
in  obtaining  optimal  or  near-optimal  solutions  in  polynomial  time. 

5.1 Deterministic Examples

Deterministic  problems  are  discussed  in  detail  in  Appendix  A.  Services  are 
assumed  to  be  deterministic  and  known,  no-shows  are  not  allowed,  and  the  unit  cost 
of  overtime  is  zero.  The  results  obtained  in  that  appendix  are  used  here  to  determine 
the  optimal  sequences  for  two  sample  deterministic  problems.  These  problems  will 
demonstrate  that  the  curious  features  of  stochastic  sequencing  problems  have  their 
root  not  in  their  stochastic  nature,  nor  in  the  inclusion  of  overtime,  but  in  the 
nature  of  the  family  of  deterministic  waiting  problems  at  the  heart  of  each  stochastic 
problem.  Consider  the  deterministic  example  defined  by  the  parameters  in  Table  5. 
Customers are labeled A, B, and C to avoid confusion of indices with order.

Table 5. Parameters for the first deterministic example

customer   x   c
A          3   8
B          2   3
C          1   1

When $\tau_h = 0$, the optimal schedule is trivially [0 0 0]. Theorem 13 in Appendix A
proves that WSPT¹ yields the optimal sequence, ABC. The optimal cost is

$$C(\tau) = c_B x_A + c_C (x_A + x_B) = 14.$$

Now set $\tau_h = 1$. As discussed in Appendix A, the optimal schedule becomes
[0 1 1], regardless of sequence. Retaining the WSPT order yields a cost of

$$C(\tau) = c_B (x_A - 1) + c_C (x_A - 1 + x_B) = 10.$$

But the optimal sequence can be shown by the methods of Appendix A to be CAB,
for a cost of

$$C(\tau) = c_A (x_C - 1) + c_B (x_C - 1 + x_A) = 9.$$

The optimal sequences and schedules for various values of $\tau_h$ are shown in Table 6.


Table 6. Optimal schedules and sequences for a deterministic example. This is the
problem described in Table 5.

horizon   sequence   schedule   cost
0         ABC        [0 0 0]    14
1         CAB        [0 1 1]    9
2         BAC        [0 2 2]    3
3         CBA        [0 1 3]    0
          BCA        [0 2 3]    0

¹ Weighted shortest processing time (WSPT) is the ordering of customers from smallest to largest
value of $x_j/c_j$. A process is called WSPT if the optimal sequence is always WSPT.

This dependence of the optimal sequence on $\tau_h$ is remarkable, but even more
remarkable is the propensity for the optimal sequence to place identical customers
in very different places. Suppose $N = 4$, $x = [3, 1, 1, 1]$, and $c = [4, 1, 1, 1]$. Call the
first customer A and the other three (identical) customers B. The optimal results are
shown in Table 7. This tendency for the optimal sequence to separate identical
customers in some situations appears to be unique among single-machine deterministic
sequencing problems examined in the literature.


Table 7. Optimal schedules and sequences for another deterministic example. Here,
$x = [3, 1, 1, 1]$ and $c = [4, 1, 1, 1]$.

horizon   sequence   schedule    cost
0         ABBB       [0 0 0 0]   12
1         BABB       [0 1 1 1]   7
2         BBAB       [0 1 2 2]   3
3         BBBA       [0 1 2 3]   0
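
The entries of Tables 6 and 7 can be checked by brute force. The sketch below is not the method of Appendix A; it simply enumerates every sequence and every nondecreasing schedule on an integer lattice in $[0, \tau_h]$, which is assumed to be adequate here because all service times and horizons are integers. The function names and the tuple layout `(label, service time, unit waiting cost)` are hypothetical.

```python
from itertools import combinations_with_replacement, permutations

def waiting_cost(order, schedule):
    """Weighted waiting cost of a deterministic FCFS single-server schedule.
    order holds (label, service time, unit cost); overtime cost is zero."""
    free_at = 0   # time at which the server finishes the work assigned so far
    cost = 0
    for (_, service, unit_cost), arrival in zip(order, schedule):
        start = max(arrival, free_at)
        cost += unit_cost * (start - arrival)
        free_at = start + service
    return cost

def best_sequences(customers, horizon):
    """Return (minimum cost, set of sequences attaining it)."""
    slots = range(horizon + 1)
    best, optima = None, set()
    for order in set(permutations(customers)):
        label = "".join(name for name, _, _ in order)
        # combinations_with_replacement yields nondecreasing arrival times
        for schedule in combinations_with_replacement(slots, len(order)):
            cost = waiting_cost(order, schedule)
            if best is None or cost < best:
                best, optima = cost, {label}
            elif cost == best:
                optima.add(label)
    return best, optima

example_1 = [("A", 3, 8), ("B", 2, 3), ("C", 1, 1)]             # Table 5
example_2 = [("A", 3, 4), ("B", 1, 1), ("B", 1, 1), ("B", 1, 1)]  # Table 7

for horizon in range(4):
    print(horizon, best_sequences(example_1, horizon), best_sequences(example_2, horizon))
```

Under these assumptions the enumeration returns the costs and sequences listed in the two tables, including the tie between CBA and BCA at a horizon of 3.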

5.2  Stochastic  Examples 

A  service  time  distribution  can  be  considered  to  be  a  convex  combination  of 
a  number  of  deterministic  services.  It  should  therefore  be  no  surprise  that  optimal 
solutions  to  stochastic  problems  display  the  same  odd  behavior  that  was  shown  for 
deterministic  problems.  While  many  stochastic  scheduling  problems  have  simple 
optimal  ordering  rules  such  as  ordering  by  weighted  shortest  expected  processing 
time  (WSEPT)  or  by  weighted  shortest  variance  of  the  processing  time  (WSVPT), 
the  following  examples  will  help  disabuse  the  reader  of  any  remaining  notion  that 
some  simple  ordering  principle  exists  for  this  problem.  Optimal  sequences  in  this 
section  are  obtained  by  optimizing  the  schedule  for  every  possible  sequence  and 
choosing  the  best.  In  this  example,  suppose  that  there  are  two  classes  of  customers, 
each  with  show  probability  of  1.0,  and  that  each  waiting  cost  and  overtime  cost  is 
equally weighted. Customers have the service distributions shown in Table 8.

Table 8. Parameters for a stochastic example

class   service distribution                                   mean   variance   skewness
A       Erlang-2 with $\mu = 2.0$                              1.00   0.50       1.41
B       Cox-4 with $b_1 = b_2 = b_3 = 1.0$, $\mu_1 = 1.3$,
        and $\mu_2 = \mu_3 = \mu_4 = 13.0$                     1.00   0.61       1.91

Suppose there are two customers of class A and two of class B to be scheduled
within $\tau_h$. The sequences in Table 9 are found (by enumeration) to be optimal over
a 101-slot lattice:

Table 9. Optimal solutions for the stochastic example

horizon                      optimal sequence
$\tau_h$ < 0.46              costs of all sequences are equal
0.46 < $\tau_h$ < 1.14       BBAA
1.14 < $\tau_h$ < 1.21       ABAB or ABBA
1.21 < $\tau_h$              AABB

This  example  shows  that,  even  when  the  cost  coefficients,  means,  and  show  rates  of 
all  customers  are  equal,  customers  cannot  merely  be  ordered  by  increasing  service 
variance,  as  has  been  hypothesized  in  previous  research  [162,  164]. 

The  reader  may  suspect  the  effect  of  higher  moments  is  causing  the  reversal  of 
customer  order  in  those  regions.  However,  the  coefficients  of  variation  for  the  two 
classes  are  0.71  and  0.78,  respectively,  so  the  third  moment  should  have  little  effect 
on  the  schedule,  and  thus  on  the  sequence,  of  arrivals. 
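
The moments quoted in Table 8 and the coefficients of variation above can be reproduced with a short calculation. The sketch assumes that each $b_i$ is the probability of continuing from phase $i$ to phase $i+1$, so that with $b_1 = b_2 = b_3 = 1.0$ the class B service is a sum of independent exponential phases; the function name `serial_phase_moments` is hypothetical.

```python
import math

def serial_phase_moments(rates):
    """Mean, variance, skewness, and coefficient of variation of a sum of
    independent exponential phases visited in order with the given rates."""
    mean = sum(1.0 / r for r in rates)
    variance = sum(1.0 / r**2 for r in rates)
    third_central = sum(2.0 / r**3 for r in rates)
    skewness = third_central / variance**1.5
    return mean, variance, skewness, math.sqrt(variance) / mean

class_a = serial_phase_moments([2.0, 2.0])                # Erlang-2, rate 2.0
class_b = serial_phase_moments([1.3, 13.0, 13.0, 13.0])   # Cox-4 with all b_i = 1

for name, moments in (("A", class_a), ("B", class_b)):
    print(name, [round(m, 3) for m in moments])
```

The results agree with the tabulated values to about two decimal places, and the last entry of each tuple gives the coefficients of variation of roughly 0.71 and 0.78.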

Consider a 3-customer problem with Erlang services and the parameters listed
in Table 10.

Table 10. Parameters used in Figure 9

customer   unit cost   Erlang phases   phase rate   mean   variance
A          1.0         1               1.0          1.00   1.00
B          1.0         2               2.0          1.00   0.50
C          1.0         2               3.0          0.67   0.22

Figure 9 maps the optimal sequence of this problem as the schedule horizon
and the overtime unit cost are varied. The sequence CBA orders customers by
WSEPT, while ABC orders customers by WSVPT. These are the most commonly
encountered optimal sequences over the space depicted. (The notation Cxx indicates
sequences CAB and CBA are very nearly identical in cost, and that one of them is
optimal.) For a given unit cost vector, the optimal sequence typically changes from
Cxx to CBA to ABC as the horizon increases. This is intuitively appealing, since
one would expect the service mean to have more cost impact than the variance when
the schedule is very constrained, and the probability that customers 2 through N
must each wait is high. Conversely, when the schedule is much less constrained, and
customers seldom have to wait, one would expect the service mean to play a very
small part in sequence preference. That the transition point between optimality
of the WSEPT and WSVPT sequences is at a higher $\tau_h$ when $c_4$ is small is also
intuitively appealing, since variance of an earlier service is expected to have more
impact than mean on the overtime.

This  interface  between  optimality  of  the  mean-  and  variance-ordered  sequences 
is  often  marked  by  intermediate  optima,  as  can  be  seen  in  the  enlargement  at  the 
bottom  of  Figure  9.  Of  necessity,  points  near  this  border  depict  problems  in  which 
two  sequence  costs  are  very  close,  and  thus  small  numerical  instabilities  in  the  cost 
evaluation process should be more evident. However, these numerical artifacts
contribute only a small part to the jumble of optima at the border. The cost evaluation
appears  to  be  accurate  to  at  least  0.001%,  and  most  of  the  cost  differences  between 
sequences  in  this  area  are  greater  than  0.01%.  While  the  shape  of  the  border  is  only 
roughly  accurate,  there  are  indeed  areas  where  sequences  CAB  and  BAC  are  optimal 
for  this  problem. 

Wang attempted to prove that if services are exponential and all unit waiting
time costs are equal, then customers will be optimally ordered by decreasing service
rate. This ordering is identical to both WSEPT and WSVPT in this situation. No
counterexamples have been observed when the overtime point is at zero, as it is in
his cost formulation. However, if $\tau_v > 0$, there are cases in which customers are not
optimally ordered by service rate. For example, if there are three customers with
exponential services, $\mu_A = 100.0$, $\mu_B = 10.0$, and $\mu_C = 1.0$, show probabilities are all
1.0, $\tau_h = \tau_v = 19$, $c_2 = c_3$, and $c_4 = 100 c_3$, then the optimal cost for arrival sequence
ABC is nearly five times that of sequence ACB.

[Figure 9. Optimal sequence as the schedule horizon and the overtime unit cost are
varied; axis label: log(horizon).]


5.3  Mean  Residual  Life  Approach 

Use of a mean residual life approximation to waiting time may lead to better
intuitive understanding of the strange optimal sequences obtained in some situations.
Consider the example in Table 8 for the specific case of $\tau_h = 0.7$. The optimal
schedules for two reasonable candidate sequences for optimum are:

sequence   schedule                     cost
BBAA       [ 0.00  0.47  0.70  0.70 ]   7.601
AABB       [ 0.00  0.50  0.70  0.70 ]   7.611

These schedules are close enough that one may neglect the difference in arrival time
of customer 2 for the purpose of sequencing.

The choice of class for the third and fourth customers has little impact on
optimal cost, since these customers are fixed at the last slot, the last slot is the
selected onset of overtime, and the classes have the same means. It is reasonable
that the choice of class for the second customer will have the greatest impact on cost,
since its arrival time is close to those of two actual and one fictitious customers.
In fact, about two-thirds of the waiting time for customer 3 is contributed solely
by customer 2, regardless of the class of the first customer. The mean residual
life function, $L(t) = E[x - t \mid x > t]$, has a close relationship with waiting time;
the expected waiting time of customer 3 contributed solely by customer 2 can be
approximated by $L(\tau_3 - \tau_2)(1 - F(\tau_3 - \tau_2))$. The first plot in Figure 10 depicts
the CDFs for the services of classes A and B, approximated using a portion of the
evaluation routine in Section H.1, and the data for the other plots are approximated
from these CDFs. Since in this example $\tau_3 - \tau_2 \approx 0.25$, it can be seen from the last
plot in the figure that
$L_A(\tau_3 - \tau_2)(1 - F_A(\tau_3 - \tau_2)) > L_B(\tau_3 - \tau_2)(1 - F_B(\tau_3 - \tau_2))$,
implying that the waiting time contribution of customer 2 to subsequent customers
will be greater if the second customer is of class A than if it is of class B.

The selection of the first customer's class can be made with a similar argument.
Here, $L(\tau_2 - \tau_1)(1 - F(\tau_2 - \tau_1))$ is precisely $E(W_2)$. Since $\tau_2 - \tau_1 \approx 0.45$, the last plot
in Figure 10 shows that
$L_A(\tau_2 - \tau_1)(1 - F_A(\tau_2 - \tau_1)) > L_B(\tau_2 - \tau_1)(1 - F_B(\tau_2 - \tau_1))$
again. The effect of the first customer on the waiting times of the third, fourth and
(fictitious) fifth customers has the opposite trend, but is overshadowed by the effect
of customer 2. Hence the first customer should be of class B.
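
The two inequalities used in this section can be checked numerically. The quantity $L(t)(1 - F(t))$ equals $E[(x - t)^+]$, and for a service made of exponential phases in series it is the sum, over phases, of the probability of being in that phase at time $t$ times the mean service remaining from that phase. The sketch below computes the phase probabilities by uniformization; this is a substitute chosen for illustration, not the evaluation routine of Section H.1, and it assumes all Coxian branching probabilities equal 1 as in Table 8. The function names are hypothetical.

```python
import math

def phase_probabilities(rates, t, tol=1e-12):
    """Probability of occupying each phase at time t for exponential phases
    visited in series, starting in the first phase (uniformization)."""
    n = len(rates)
    lam = max(rates)
    v = [1.0] + [0.0] * (n - 1)        # phase distribution after k jumps
    weight = math.exp(-lam * t)        # Poisson(lam * t) probability of k = 0
    cumulative = weight
    probs = [weight * x for x in v]
    k = 0
    while 1.0 - cumulative > tol and k < 10000:
        k += 1
        step = [0.0] * n
        for i in range(n):
            advance = rates[i] / lam
            step[i] += v[i] * (1.0 - advance)
            if i + 1 < n:
                step[i + 1] += v[i] * advance   # mass leaving the last phase is absorbed
        v = step
        weight *= lam * t / k
        cumulative += weight
        probs = [p + weight * x for p, x in zip(probs, v)]
    return probs

def expected_excess(rates, t):
    """L(t) * (1 - F(t)) = E[(x - t)^+] for a series of exponential phases."""
    remaining = [sum(1.0 / r for r in rates[i:]) for i in range(len(rates))]
    return sum(p * m for p, m in zip(phase_probabilities(rates, t), remaining))

CLASS_A = [2.0, 2.0]                  # Erlang-2 from Table 8
CLASS_B = [1.3, 13.0, 13.0, 13.0]     # Cox-4 from Table 8, all b_i = 1

for gap in (0.25, 0.45):
    print(gap, round(expected_excess(CLASS_A, gap), 4), round(expected_excess(CLASS_B, gap), 4))
```

At both gaps the class A value comes out slightly larger than the class B value, which is the ordering read from the last plot of Figure 10; the margin is small, consistent with the closeness of the two sequence costs.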

5.4  Local  Search  Approach 

A number of attempted analytic approaches to the stochastic problem met
with failure. The mean residual life approach just presented is at best a way of
approximating waiting time. Attempts at a standard swapping approach failed
because they led to unmanageable expressions, even when the customers were adjacent.
Because of the inherent complexity of the problem, these analytic attempts were
abandoned in favor of heuristics. As will be seen, a simple heuristic can obtain the
global optimum in a majority of problems and achieve what will often be acceptable
suboptimal solutions in the remaining cases. Such heuristics are of value for two
principal reasons. First, there are currently no successful approaches to sequencing
appointments, so any method is an improvement. Second, it is probable that the
problem is strongly NP-hard. If that is so, any analytical approach likely would be
computationally oppressive on reasonably sized problems, while the heuristic below
is polynomial.

Several local search algorithms were tested on 4- and 6-customer sequencing
problems. One entailed performing the best swap of adjacent pairs at each iteration.
Another entailed performing iterations that consisted of inserting each customer in
turn at the spot that resulted in the lowest possible cost. These candidates were
suggested by Pinedo [130: p148]. The best performance by far, both in terms of
speed and accuracy, was obtained by the following algorithm:

Sequencing Algorithm

1. Select an initial sequence, $\Pi$. Determine the cost of the optimal schedule for
$\Pi$.

2. Perform each of the $\binom{N}{2}$ possible pairwise swaps on $\Pi$, and determine the cost
of the optimal schedule for each resulting sequence.

3. If the best swap in step 2 improved on $\Pi$, replace $\Pi$ with that sequence and go to
step 2. Otherwise, accept $\Pi$ as the optimal sequence.
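
The three steps above can be sketched as follows. The routine that returns the cost of the optimal schedule for a sequence is left as a caller-supplied function, here called `schedule_cost`; that name, `pairwise_swap_search`, and the toy cost function are hypothetical. The toy cost is the first deterministic example of Section 5.1 with a horizon of zero, where every customer arrives at time 0, so no schedule optimization is needed.

```python
from itertools import combinations

def pairwise_swap_search(initial, schedule_cost):
    """Best-improvement local search over all pairwise swaps of a sequence.
    schedule_cost(sequence) must return the optimal-schedule cost of sequence."""
    current = tuple(initial)
    current_cost = schedule_cost(current)                  # step 1
    passes = 0
    while True:
        passes += 1                                        # one pass through step 2
        best_neighbor, best_cost = None, current_cost
        for i, j in combinations(range(len(current)), 2):
            neighbor = list(current)
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            cost = schedule_cost(tuple(neighbor))
            if cost < best_cost:
                best_neighbor, best_cost = tuple(neighbor), cost
        if best_neighbor is None:                          # step 3: no swap improves
            return current, current_cost, passes
        current, current_cost = best_neighbor, best_cost

# Toy cost: Table 5 data with all customers arriving at time 0.
service = {"A": 3, "B": 2, "C": 1}
unit_cost = {"A": 8, "B": 3, "C": 1}

def cost_at_zero_horizon(sequence):
    elapsed, total = 0, 0
    for name in sequence:
        total += unit_cost[name] * elapsed   # each customer waits for all predecessors
        elapsed += service[name]
    return total

result = pairwise_swap_search("CBA", cost_at_zero_horizon)
print(result)
```

Starting from the worst sequence, CBA, the search reaches ABC in its first pass and confirms it in the second. Each pass calls `schedule_cost` once per pair, so in a real application the schedule optimization dominates the run time.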

On its face, this is a poor approach to sequencing. Consider a sequencing
problem with no inherent structure; each of the $(N!)!$ orderings by cost of the
possible sequences is equally likely. For $N = 3$, it can be shown analytically that the
algorithm is expected to find the global optimum 11/12 of the time. The other 1/12
of the time, the initial choice of $\Pi$ is a local optimum, and the algorithm
immediately fails to make progress. For other problem sizes, the likelihood of finding the
global optimum for an unstructured problem can be estimated using a Monte Carlo
approach. Table 11 shows that the algorithm should be spectacularly unsuccessful
even on small problems, unless there is some underlying regularity to the cost of
the sequences that favors the sequencing algorithm. Further, the algorithm fails
regularly on deterministic problems, often obtaining costs over 50% higher than the
actual optimum. Nevertheless, the next two sections quantify the success of this
algorithm on a set of randomly generated stochastic problems, and Appendix E shows
its effectiveness on an actual problem.

Table 11. Success rate for the sequencing algorithm on unstructured problems.
Probability of the sequencing algorithm successfully finding the optimal
sequence for randomly generated problems. Each of the possible $(N!)!$
orderings of the sequences by cost is equally likely. The average number
of iterations (passes through step 2 of the algorithm) is also tabulated.
All but N = 3 are based on 10,000 runs of a Monte Carlo simulation.

N   success rate   average # passes
3   11/12          2.10
4   0.528          2.30
5   0.192          2.46
6   0.048          2.90
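
The 11/12 success rate for N = 3 can be confirmed by exhaustive enumeration, since there are only 6! = 720 ways to rank the six sequences by cost. The sketch below assigns each ranking as a cost table and runs the pairwise-swap search from a fixed starting sequence (by symmetry, a fixed start and a random start succeed equally often on unstructured problems). The function names are hypothetical, and larger N would require random sampling of rankings rather than full enumeration.

```python
from fractions import Fraction
from itertools import combinations, permutations

def swap_search_finds_optimum(cost, start):
    """Run best-improvement pairwise-swap search on a cost table and report
    whether it stops at the globally cheapest sequence."""
    current = start
    while True:
        neighbors = []
        for i, j in combinations(range(len(current)), 2):
            s = list(current)
            s[i], s[j] = s[j], s[i]
            neighbors.append(tuple(s))
        best = min(neighbors, key=cost.__getitem__)
        if cost[best] >= cost[current]:        # no swap improves: stop
            return cost[current] == min(cost.values())
        current = best

def exact_success_rate(n):
    """Enumerate every cost ranking of the n! sequences (practical only for n = 3)."""
    sequences = list(permutations(range(n)))
    start = sequences[0]
    wins = trials = 0
    for ranking in permutations(range(len(sequences))):
        cost = dict(zip(sequences, ranking))
        wins += swap_search_finds_optimum(cost, start)
        trials += 1
    return Fraction(wins, trials)

rate_n3 = exact_success_rate(3)
print(rate_n3)
```

The failures are exactly the rankings in which the starting sequence beats its three neighbors without being the global optimum, which happens with probability 1/4 - 1/6 = 1/12.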

5.5  Experiment  Design 

To test the sequencing algorithm, a number of test problems were selected
across the range of parameters that are expected to be encountered in
scheduling/sequencing problems. For each problem, the global optimum was approximated
using the algorithm and actually determined by optimizing the schedule for each
possible sequence. Coxian parameters were determined using the methods developed in
Appendix F. The following ranges and characteristics were selected.


- Number of customers. Because exhaustive enumeration is required to check
the algorithm, the time required to test a problem becomes prohibitively long
for even small values of N. For N = 6, for example, a single design point
requires 15 minutes on a 133 MHz Pentium. For N = 10, the run time is
estimated at over 50 days. From Table 11, the algorithm for general problems
should only be successful about half the time for N = 4 and about 5% of the
time for N = 6. Any effectiveness on this problem should show up with these
values of N if it exists and should carry over to larger values of N.

- Number of schedule slots. From preliminary tests, it seems that the value of
K has no measurable effect on the efficacy of the algorithm. It was held fixed
at 11 for the 4-customer tests and at 21 for the 6-customer tests. These are
realistic values.

- Overtime point, $\tau_v$. The overtime point was not considered a large factor in
the effectiveness of the algorithm and was fixed at the schedule horizon, which
is realistic for many problems.

- Service distribution means. The mean of each customer's service was selected
independently from a lognormal distribution. The log of the service mean had
a mean of zero and a variance of 0.1, 1.0, or 10.0. The variance of the logs of
the means will be denoted as VARMEAN in the following discussion. These
values were selected in order to test both cases in which means for a series of
customers were closely clustered and those in which some means were outliers.

- Number of phases. The service distributions were limited to those represented
by at most 4 Coxian phases. Figure 32 (with r = 2) shows that this is a
reasonable compromise between the goals of maximizing the reachable 3-moment
space and keeping computation time reasonably small.

- Service coefficients of variation. Each c was selected independently from the
set {0.5, 0.8, 1.0, 1.2, 1.5}. The low value is limited by the choice of only
four phases. The high value is an extreme for realistic problems.

- Third service moment. Other research indicated that when c < 1, the third
moment was not critical to the cost for similar queueing problems [3], and
preliminary research indicated no dependence of optimal sequence on third
moment in that region. For those reasons, the third moment was left
uncontrolled when c < 1. When c was chosen as 1.2, the third noncentral moment
was chosen as either 6.05 times the cube of the first moment or 30.0 times
the cube of the first moment, with equal probability. When c was chosen as
1.5, the third noncentral moment was chosen as either 7.8 times the cube of
the first moment or 30.0 times the cube of the first moment. The low choices
represent the lowest possible values, given four Coxian phases (cf. Section F).
The high choices represent reasonably high values.

- Customer unit costs. These were chosen independently from a lognormal
distribution, with the log of the cost having a mean of zero and variance of 0.5. This
ensured a mix of problems with close unit costs with some that had outliers.

- Starting point. For each trial, the algorithm was started from one of three
sequences (noted by START in the following discussion):

    - Selected at random

    - Ordered by weighted shortest expected processing time (WSEPT)

    - Ordered by weighted shortest variance of the processing time (WSVPT)

For each of the parameter sets chosen from the above considerations, a set of 100
experiment design points was selected by varying the server overtime unit cost and
the schedule horizon regularly. The overtime unit cost was tested at the values [0.1,
1.0, 10.0, 100.0, 1000.0]. For each of these values, $\tau_h$ was tested at 20 points spaced
uniformly between 0.1 and 2.0. These two parameters were selected for more regular
testing because preliminary experiments indicated they had a strong effect on the
optimal sequence.
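
The sampling scheme described in this section can be sketched as below. It is a reading of the design, not the generator actually used: the third-moment choices and the fitting of Coxian parameters (Appendix F) are omitted, and all function and variable names are hypothetical. Note that Python's `random.lognormvariate` takes the standard deviation of the log, so the stated variances are converted with a square root.

```python
import math
import random

CV_CHOICES = (0.5, 0.8, 1.0, 1.2, 1.5)
OVERTIME_COSTS = (0.1, 1.0, 10.0, 100.0, 1000.0)
HORIZONS = tuple(round(0.1 * k, 1) for k in range(1, 21))   # 0.1, 0.2, ..., 2.0

def draw_parameter_set(n_customers, varmean, rng):
    """Draw service mean, coefficient of variation, and unit cost per customer."""
    sigma_mean = math.sqrt(varmean)      # VARMEAN is the variance of log(mean)
    sigma_cost = math.sqrt(0.5)          # variance of log(unit cost) is 0.5
    return [
        {
            "mean": rng.lognormvariate(0.0, sigma_mean),
            "cv": rng.choice(CV_CHOICES),
            "unit_cost": rng.lognormvariate(0.0, sigma_cost),
        }
        for _ in range(n_customers)
    ]

def design_points():
    """The 5 x 20 grid of (overtime unit cost, schedule horizon) pairs."""
    return [(cost, horizon) for cost in OVERTIME_COSTS for horizon in HORIZONS]

rng = random.Random(0)   # fixed seed only so the sketch is repeatable
customers = draw_parameter_set(4, varmean=1.0, rng=rng)
grid = design_points()
print(len(grid), customers[0])
```

Each drawn parameter set would then be evaluated at all 100 grid points, once by the sequencing algorithm and once by exhaustive enumeration.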

5.6  Experiment  Results 

The results of 10,000 4-customer trials are summarized in Table 12. To check
the relative effectiveness of the algorithm over different groupings of customer means,
the variance of the logs of the means (VARMEAN) was selected at random in each
trial from the three values discussed in the preceding section:
4200 trials at VARMEAN=0.1, 2477 trials at VARMEAN=1.0, and 3323 trials at
VARMEAN=10.0. The efficacy of the algorithm was measured in terms of:

- The percentage of time the algorithm returned a sequence whose cost was the
global optimum or was within 0.001% of the cost of the global optimum (%
success). This acceptance of near-optima avoids potential problems with small
floating-point errors in the cost computation routines.

- The average percent error of the conjectured optimal cost from the actual
optimal cost (MAPE).

- The average error divided by the average cost (% error). This measure avoids
heavily weighting percent errors that are high due to the cost being quite low.

- The maximum percent error of the conjectured optimal cost from the actual
optimal cost (max % error).

- The average number of iterations (passes through step 2 of the algorithm)
required to reach the conjectured optimal cost (ave iter).

- The maximum number of iterations required to reach the conjectured optimal
cost (max iter).
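
The difference between the MAPE and % error measures is easiest to see in a small calculation. The sketch below is one reading of the definitions above; in particular, it assumes that "average cost" means the average of the actual optimal costs, and the function name `summarize_trials` and the sample numbers are hypothetical.

```python
def summarize_trials(found, optimal, tolerance=1e-5):
    """found[i] is the cost returned by the heuristic and optimal[i] the true
    optimal cost for trial i. tolerance = 1e-5 corresponds to 0.001%."""
    n = len(found)
    errors = [f - o for f, o in zip(found, optimal)]
    pct_errors = [100.0 * e / o for e, o in zip(errors, optimal)]
    successes = sum(e <= tolerance * o for e, o in zip(errors, optimal))
    return {
        "pct_success": 100.0 * successes / n,
        "mape": sum(pct_errors) / n,
        # average error over average cost (assumed: average optimal cost)
        "pct_error": 100.0 * (sum(errors) / n) / (sum(optimal) / n),
        "max_pct_error": max(pct_errors),
    }

# Two exact trials and one miss on a problem whose optimal cost is very small.
stats = summarize_trials(found=[10.0, 20.0, 0.02], optimal=[10.0, 20.0, 0.01])
print(stats)
```

In this made-up sample the single miss doubles a tiny cost, so MAPE is about 33% while % error is only a few hundredths of a percent, which is the weighting effect described above.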


Table 12. Results of the 10,000 4-customer trials.

VARMEAN   START    % success     MAPE     % error       max % error   ave iter   max iter
0.1       WSEPT    96.1%         0.076%   0.0023%       8.4%          2.45       9
0.1       WSVPT    95.5%         0.092%   0.0016%       15.6%         2.71       9
0.1       random   95.8%         0.081%   0.0019%       12.7%         3.79       10
1.0       WSEPT    91.8%         0.080%   0.0066%       14.2%         2.07       7
1.0       WSVPT    [illegible]   0.109%   0.0050%       23.9%         1.82       8
1.0       random   91.2%         0.131%   0.0080%       16.5%         4.11       11
10.0      WSEPT    98.9%         0.009%   [illegible]   2.6%          1.34       5
10.0      WSVPT    98.6%         0.010%   0.0011%       2.0%          1.29       4
10.0      random   82.1%         7.790%   0.1754%       638.8%        4.00       13

Goodness-of-fit tests (chi-square and Kolmogorov-Smirnov) suggest that the
observed errors and percent errors obtained from each starting point are
exponentially distributed, as are the numbers of iterations required.

There were a number of spectacular failures at VARMEAN=10.0 for starting
sequences chosen at random. Because a random starting point is ineffective for some
problems, it was not utilized in subsequent trials. While the results starting at
WSEPT and WSVPT appear quite similar, WSEPT yields a smaller average
percent error (MAPE) in each set of trials. Among the cases in which the candidate optima
found by starting at WSEPT and WSVPT differed, 46% of the WSEPT starts
were better than the WSVPT starts at VARMEAN=1.0, compared to 100% for
VARMEAN=10.0 and 74% for VARMEAN=0.1. This would lead one to prefer a
WSEPT start, and this was the course taken in the remainder of this effort.

For  each  of  the  above  sets  of  trials,  the  magnitude  of  errors  appeared  to  be 
related  to  the  size  of  the  overtime  cost  coefficient.  When  the  domain  was  restricted 
to  those  trials  that  did  not  locate  the  optimum,  the  correlations  between  the  percent 
error and $\log(c_s / \sum_{i=1}^{N} c_i)$ fell between -0.37 and -0.57. A larger relative value of the
overtime  cost  coefficient  seems  to  exert  a  stabilizing  influence  on  the  algorithm.  This 
may  have  to  do  with  the  fact  that  the  larger  coefficient  leads  to  optimal  schedules 
that are earlier, and thus vary less when the sequence is modified. Although
decreasing $\tau_h$ also shifts the optimal schedule earlier, an opposite effect was observed;
the correlation between $\tau_h / \sum_{i=1}^{N} E[X_i]$ and error ranged from 0.25 to 0.58.

A set of 6-customer experiments was run with WSEPT as the starting point.
VARMEAN  was  set  to  1.0,  since  this  intermediate  value  generated  the  poorest  results 
in  the  4-customer  experiment.  Table  13  compares  the  results  with  the  4-customer 
results  found  above.  While  the  algorithm  clearly  does  not  perform  as  well  with  6 
customers,  the  results  are  still  acceptable  for  most  applications. 

Unfortunately,  validation  of  the  sequencing  algorithm  requires  evaluation  of 
the  optimal  schedules  for  each  possible  sequence.  Because  of  the  roughly  factorial 
dependence  of  run  time  on  number  of  customers,  this  validation  is  prohibitive  in 
terms  of  run  time  for  all  but  the  smallest  problems.  The  evaluation  of  the  algorithm’s 
performance  for  higher  numbers  of  customers  is  relegated  to  future  research. 


Table 13. Comparison of four- and six-customer experiment results.

customers  # trials  % success  MAPE   % error  max % error  ave iter  max iter
4          2477      91.8%      0.08%  0.0066%  14.2%        2.07      7
6          1300      85.1%      0.22%  0.0101%  13.0%        4.62      13

5.7  Summary

In  summary,  the  optimal  sequence  of  arrivals  to  an  appointment  system  with 
iid  stochastic  services  was  seen  to  be  highly  dependent  on  a  number  of  variables, 
including  schedule  horizon  and  relative  sizes  of  unit  costs.  These  dependencies  are 
unpredictable  without  extensive  calculation;  the  optimal  arrival  time  of  a  particular 
customer  may  occur  at  the  beginning  of  the  sequence  for  one  set  of  parameters  and 
at  the  end  for  a  slightly  changed  set.  Further,  identical  customers  typically  are  not 
even  adjacent  in  the  optimal  sequence,  a  rare  situation  in  sequencing  problems.  The 
seemingly  erratic  behavior  of  the  optimal  sequence  is  evinced  even  when  services  are 
deterministic. 

While a number of analytical approaches to sequencing arrivals to an appointment
system with iid stochastic service times were unsuccessful, an effective heuristic
was determined. Tests of up to six customers succeeded in finding the global optimum
over 85% of the time, and the maximum error over thousands of tests was about 14%.

Despite  the  long  history  of  appointment  system  analysis,  the  fact  that  certain 
sequences  resulted  in  lower  appointment  system  costs  was  not  recognized  in  the 
literature  until  this  decade  [164].  Up  to  now,  optimal  sequencing  has  addressed 
extremely  limited  cases,  and  attempts  to  solve  those  have  been  unsuccessful  [162, 
164].  This  dissertation  is  the  first  recognition  of  the  odd  behavior  of  the  optimal 
sequences,  the  first  observation  that  the  optimal  sequencing  problem  has  its  roots  in 
the  deterministic  analogue,  and  the  first  successful  solution  approach  to  the  sequence 
optimization of stochastic service systems.

VI.  Conclusion


This chapter summarizes the research performed in this dissertation, including
the appendices following. It highlights its unique contributions and proposes
directions  for  future  research. 

6.1  Contributions

This  research  has  contributed  materially  in  a  number  of  ways  to  the  goal  of 
optimizing  arrival  times  to  an  appointment  system.  First,  the  cost  function  used  is 
much  broader  than  those  used  in  the  past.  It  incorporates  the  effects  of  no-shows 
and  lateness.  No  past  effort  has  considered  the  effects  of  no-shows.  Lateness  was 
only  incorporated  in  cost  evaluation  in  the  case  of  identical  service  distributions  and 
identical interarrival times. The cost function used here employs a generalization
of  overtime  that  unifies  the  individual  measures  of  server  availability,  idle  time,  and 
overtime  used  in  other  works. 

Other efforts have employed an embedded Markov chain approach to appointment
system cost evaluation, as this work did. By doing so, distinct phase-type service
distributions are admitted, which can approximate general service distributions
to arbitrary accuracy. However, those efforts relied inherently on Jordan decomposition
for matrix exponentiation, a procedure that was shown here to err substantially
in floating-point implementations when eigenvalues are nearly confluent. No such
difficulty was encountered in this approach.
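
The sensitivity to nearly confluent eigenvalues can be illustrated with a toy case. The sketch below is an assumption-laden illustration, not the dissertation's own test: it takes the 2x2 matrix A = [[a, 1], [0, b]], whose Jordan form is diagonal when a and b differ, so the eigendecomposition gives the off-diagonal entry of exp(A) as the divided difference (e^b - e^a)/(b - a). The function names and the rewritten form using `math.expm1` are chosen here for illustration only.

```python
import math

def offdiag_via_eigen(a, b):
    """(1,2) entry of exp(A), A = [[a, 1], [0, b]], from V * diag(e^a, e^b) * V^-1.

    With eigenvectors (1, 0) and (1, b - a), the product reduces to a divided
    difference, which cancels catastrophically when b is close to a.
    """
    return (math.exp(b) - math.exp(a)) / (b - a)

def offdiag_stable(a, b):
    """Same entry rewritten as e^a * expm1(b - a) / (b - a), avoiding the cancellation."""
    d = b - a
    return math.exp(a) * math.expm1(d) / d

a = -1.0
for gap in (1e-2, 1e-6, 1e-10, 2.0 ** -53):
    b = a + gap
    naive = offdiag_via_eigen(a, b)
    stable = offdiag_stable(a, b)
    rel_diff = abs(naive - stable) / stable
    print(f"gap={gap:.3e}  eigen={naive:.12f}  stable={stable:.12f}  rel.diff={rel_diff:.2e}")
```

As the gap between the two eigenvalues shrinks, the eigendecomposition result drifts away from the rewritten form, although both expressions are algebraically identical.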

Second, an approach to optimizing the schedule of arrivals over a lattice
was developed in this work. This method relied on the piecewise convexity and
submodularity  of  the  cost  function  when  lateness  is  not  allowed.  The  only  other 
efforts  that  addressed  optimization  of  the  scheduled  arrival  times  over  a  lattice  were 
limited  to  identical  Erlang  service  times.  Further,  these  methods  were  shown  to  be 
less efficient than the one proposed.

Third, optimal sequencing of appointments had previously been considered only for
systems with two customers or for systems with exponential services and equal unit
waiting costs. Here, a heuristic approach to sequencing was introduced that admits
arbitrarily accurate approximations to distinct general distributions, distinct
no-show rates, and arbitrary unit costs.

Last,  a  new  approach  to  matching  moments  using  Coxian  distributions  was 
introduced.  It  was  proven  that  a  Coxian  phase  appended  to  an  Erlang  distribution 
can  match  the  first  three  moments  of  any  given  distribution  with  positive  support, 
requiring  only  four  parameters  to  be  determined.  This  parsimonious  representation 
could  be  of  use  in  numerous  other  models  that  employ  phase-type  distributions. 

6.2  Future  Research 

There is a great deal more research to be done in the area of appointment system
optimization. The proposed approach to sequencing is heuristic and does not
always lead to the global optimum; perhaps a better heuristic approach might be
found. Also, the proposed solution to the problem of optimizing the set of combinations
can be improved and requires further attention before actual implementation.
In  particular,  it  assumes  the  same  number  of  customers  are  scheduled  each  day.  If 
this  restriction  were  relaxed,  some  addition  to  the  cost  function  would  be  necessary 
to  account  for  the  value  of  service,  which  previously  was  a  constant. 

While  the  appointment  system  model  considered  here  is  more  general  than 
those  considered  in  other  research,  there  are  still  assumptions  that  are  unrealistic 
and  that  future  research  should  attempt  to  remove.  For  instance,  a  scheme  was 
developed  for  approximating  schedule  cost  if  customer  lateness  is  allowed.  However, 
modeling  lateness  destroys  convexity  and  submodularity,  both  of  which  were  essential 
to  the  arguments  presented  in  Chapter  IV,  so  this  feature  was  not  incorporated 
into  the  optimization  algorithms.  Very  little  has  been  done  in  the  past  even  in 
terms of evaluating expected waiting time under lateness, and no research has been
done to show what the effects of lateness are on the optimal sequence, schedule, or
cost.  These  are  important  questions  that  this  dissertation  may  provide  an  initial 
framework  for  addressing. 

A  single-server  queue  was  assumed.  If  the  server  is  replaced  by  a  network  of 
servers,  submodularity  still  holds  (by  the  same  proof  as  that  of  Theorem  2),  so  the 
lattice  scheduling  algorithms  still  are  effective.  A  cost  algorithm  for  this  situation 
can  be  generated  by  expanding  the  state  space  of  the  continuous  imbedded  Markov 
chain.  It  is  unclear  how  effective  the  sequencing  algorithm  would  be. 

Other  features,  such  as  balking,  preemption  by  unscheduled  customers,  server 
failures,  rescheduling  of  no-shows,  and  dependence  of  service  time  on  factors  such  as 
current  queue  length,  are  important  considerations  in  some  applications.  However, 
they  have  not  been  incorporated  into  the  current  model. 

It  may  be  possible  to  expand  the  model  to  consider  more  general  cost  functions 
as well. Currently, the cost is a linear combination of server overtime and each
customer's waiting time. One might add another overtime term with a new overtime
point  in  order  to  increase  the  cost  of  overtime  beyond  a  certain  time. 

The  heuristic  sequencing  algorithm  might  be  more  effective  if  incorporated  into 
a  general  search  algorithm,  such  as  a  Tabu  search.  This  could  allow  rapid  location 
and  rejection  of  local  minima.  Wang  first  proposed  such  an  approach  in  a  recent 
conversation  with  the  author. 

The  deterministic  sequencing  problem  was  formulated  but  not  solved.  While 
this  does  not  seem  to  be  a  good  model  of  any  practical  situation,  its  solution  might 
lead  to  some  insight  into  the  stochastic  sequencing  problem.  In  particular,  it  might 
help  explain  why  the  sequencing  algorithm  works  well  on  stochastic  problems  but 
often  performs  poorly  on  deterministic  ones.  This  could  lead  to  prediction  of  which 
stochastic problems the sequencing algorithm will fail to solve satisfactorily.

A reverse course of inquiry may be profitable as well. Could it be that transforming
a deterministic scheduling problem into a stochastic one would improve the
performance  of  the  sequencing  algorithm  and  yet  allow  the  optimal  sequence  of  the 
deterministic  problem  to  be  determined?  Such  “stochastization”  approaches  have 
proved  helpful  on  problems  such  as  determining  the  three-dimensional  geometry  of 
a  set  of  objects,  given  distance  information  [110].  The  main  advantage  to  such  an 
approach  is  a  smoothing  of  local  optima,  which  is  precisely  the  problem  encountered 
in  the  deterministic  sequencing  problem.  While  the  solution  of  the  deterministic 
sequencing  problem  is  not  of  tremendous  import,  it  could  lead  to  effective  solution 
methods for its close relative, the $1||\sum w_j T_j$ problem.

As  noted  here  and  by  Topkis,  there  is  a  wealth  of  functions  whose  submodular 
structure  can  be  exploited  [155].  Application  of  the  fixed-lattice  algorithm  to  other 
submodular  lattice  problems  might  be  more  effective  than  current  approaches. 

However, as the preliminary study in Appendix E made clear, no further analytical
research need be done in order to use this work to achieve vast improvements
over the current state of appointment systems. Currently, no practitioner has attended
to the question of optimal sequence of arrivals, and few pay attention to
any  quantity  but  the  service  mean  when  determining  a  schedule  of  arrivals.  Every 
appointment  system  in  which  the  waiting  time  of  customers  has  value,  whether  it  be 
patients  arriving  to  a  doctor’s  office,  parts  arriving  to  a  just-in-time  system,  cargo 
planes  arriving  to  an  unloading  facility,  or  fighter  planes  arriving  to  a  practice  range, 
can  benefit  from  the  approaches  offered  in  this  dissertation.  Of  all  the  contributions 
future  researchers  may  make,  possibly  one  of  the  most  important  is  the  practical 
demonstration  of  the  efficacy  of  this  work. 

6.3  Summary

This  concludes  one  researcher’s  attempts  both  to  solve  the  46-year-old  problem 
of optimally scheduling and sequencing arrivals to an appointment system and to
extend the current set of tools available to the analyst. The first goal - that of
establishing  effective  approaches  to  the  optimization  problems  -  was  achieved.  For 
the  first  time,  a  heuristic  algorithm  has  been  presented  that  approximates  the  optimal 
customer  arrival  time  sequence,  and  an  effective  new  analytic  algorithm  has  been 
presented  for  scheduling  the  arrival  times  of  these  customers.  This  has  opened  the 
way  for  other  researchers  to  pursue  higher-level  problems,  such  as  the  optimization 
of  customer  combinations. 

The  second  goal  of  developing  new  analytical  tools  also  was  achieved.  The 
scheduling  algorithm  has  been  shown  to  apply  to  the  optimization  of  submodular 
functions,  a  class  with  numerous  practical  applications.  Common  approaches  to 
matrix  exponentiation  have  been  examined  for  numerical  stability  and  some  new 
results  obtained.  A  parsimonious  approach  to  selecting  a  Coxian  distribution  with 
a  given  set  of  first  three  moments  has  been  presented. 

The issue of appointment system optimization is far from closed. The previous
section highlighted a number of important problems that have not yet been
addressed. It is hoped that this research on optimizing appointment systems will lay
a foundation for future work and that the tools developed here will prove of use to
other researchers.

Appendix A.  Deterministic Analogue

Important insights into the nature of the scheduling and sequencing of arrivals
to an appointment system can be gained by considering a simple deterministic
appointment system analogue. These insights are integral to the arguments in Chapter
V, but the supporting arguments are contained here to improve readability. This
appendix derives the optimal schedule for a given sequence of arrivals, given
deterministic services. The determination of the optimal sequence will be shown to be
a nonlinear knapsack problem, and thus probably NP-hard. Some special cases are
proved to be easily solvable.

Let service times be deterministic and known. No-shows are not allowed, and
customers are punctual. $N$ customers are to be assigned arrival times between 0
and $\tau_h$ (as well as an arrival sequence) such that the weighted sum of the customer
waiting times is minimized:

$$ \text{Minimize: } C(\tau) = \sum_{i=1}^{N} c_i W_i(\tau) \quad \text{such that } \tau_i \in [0, \tau_h] \;\; \forall i \tag{30} $$

The overtime term is omitted, which simplifies the following treatment while
retaining the salient features of the solution. For simplicity, it will also be assumed
that arrival times are restricted to a lattice with regular intervals of size $\Delta$ and that
customers' service times are strictly positive integral multiples of $\Delta$.

This deterministic appointment system is not very realistic. In most conceivable
circumstances, the horizon would be modified to equal the sum of service times,
in  which  case  optimal  schedules  and  sequences  can  be  obtained  trivially.  However, 
in  the  context  of  the  larger  stochastic  problem,  in  which  the  service  time  sum  is  not 
known  in  advance,  this  system  is  quite  relevant.  Most  importantly,  the  seemingly 
chaotic nature of the optimal sequence can be seen to have its roots in this simple
problem. In the following treatment, in order to avoid the trivial solution, assume
the  horizon  is  less  than  the  service  sum,  forcing  some  customers  to  wait  for  service. 

For  a  fixed  sequence,  the  optimal  schedule  of  arrivals  can  be  constructed  easily. 
This  will  be  proved  recursively  with  the  help  of  two  propositions. 


Theorem 13 In the deterministic problem with $\tau_h = 0$, any optimal sequence is
WSPT.

Proof: Since $\tau_h = 0$, the schedule is fixed at $[0\ 0\ \cdots\ 0]$, and $W_j = \sum_{i=1}^{j-1} \chi_i$.
This sequencing problem is one of minimizing weighted completion time under
simultaneous arrival times, for which the optimal sequence can be shown by a simple
swapping argument to be WSPT [130: Theorem 3.11].
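
Theorem 13 can be spot-checked on a small instance. The following is a minimal sketch with made-up service times and unit costs (the function names and numbers are illustrative assumptions, not taken from the dissertation): it orders customers by the ratio of service time to unit waiting cost and compares the result against every permutation.

```python
from itertools import permutations

def waiting_cost_at_zero_horizon(order, service, cost):
    """Sum of c_j * W_j when all customers arrive at time 0 and are served in `order`."""
    elapsed, total = 0.0, 0.0
    for j in order:
        total += cost[j] * elapsed   # customer j waits for all earlier services
        elapsed += service[j]
    return total

def wspt_order(service, cost):
    """Customer indices sorted by nondecreasing service/cost ratio (WSPT)."""
    return sorted(range(len(service)), key=lambda j: service[j] / cost[j])

service = [3.0, 1.0, 2.0]   # illustrative values only
cost = [1.0, 2.0, 2.0]

order = wspt_order(service, cost)
brute_force_best = min(
    waiting_cost_at_zero_horizon(p, service, cost)
    for p in permutations(range(len(service)))
)
print(order, waiting_cost_at_zero_horizon(order, service, cost), brute_force_best)
```

Because the weighted waiting cost differs from the weighted completion time only by a constant, the WSPT order attains the brute-force minimum on this instance.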

Lemma 14 Given deterministic arrival times and $\tau_h \le \chi_1$, the optimal schedule for
a given sequence places the first customer at 0, and the rest at $\tau_h$.

Proof: From the nontriviality condition above, at least one customer has a
nonzero waiting time. The optimal schedule has no idle time; if it did, then all
subsequent arrival times could be moved earlier, and the time gained could be used
to extend the interarrival time prior to the next customer. By reference to Figure 1,
a simple bookkeeping argument shows that

$$ W_j(\tau) = \max\left[0, \sum_{i=1}^{j-1} \chi_i - \tau_j + \tau_1\right] \quad \forall j \in [2, N] \tag{31} $$

It is not possible to change the summation, since the order is fixed. Each $W_j$ is
minimized by minimizing $\tau_1$ - i.e., setting it to zero. Each $W_j$ is linear in $\tau_j$ for
$\tau_h \le \chi_1$, so $W_j$ is minimized by maximizing $\tau_j$ - i.e., setting it to $\tau_h$. Thus, the
lowest cost is achieved when the first customer arrives at zero and all others at $\tau_h$.


Lemma 15 For a fixed sequence and deterministic arrival times, the optimal cost
for a problem with $\tau_h \le \chi_1$ is identical to the one in which $\tau_h = 0$ and the service of
the first customer is reduced by $\tau_h$.

Proof: Since $\tau_h \le \chi_1$, each customer besides the first is optimally scheduled
at $\tau_h$, and a reduction of $\tau_h$ results in a commensurate reduction of each customer's
arrival time except the first. By inspection of Equation (31), it is clear that the same
$W_j(\tau)$ is obtained when both $\tau_j$ and $\chi_1$ are reduced the same amount, so the optimal
cost is unchanged.


Theorem 16 In a deterministic problem, the optimal schedule for a given sequence
places each customer at

$$ \tau_j = \min\left[\tau_h, \sum_{i=1}^{j-1} \chi_i\right] \tag{32} $$

Proof: Consider the following recursive procedure, initializing $N_h = 1$: Set
$\tau_h = 0$, so that each arrival time is at zero. Increase $\tau_h$ by either the service time of
the initial customer in this transformed problem or by the amount needed to reach
the desired value of $\tau_h$, whichever is smaller. In the former case, the first interarrival
time is equal to the first service time, so the second customer experiences no waiting
time and the server no idle time. The optimal schedule places a single customer at zero and
the remainder at $\tau_h$. If $\tau_h$ equals its desired value, stop. Otherwise, use Lemma 15
to transform the problem again by reducing $\chi_1$ and $\tau_h$ to zero, then increment $N_h$
and repeat the procedure.

When the procedure is completed (guaranteed in at most $N - 1$ iterations),
each interarrival time between customers $i$ and $i + 1$ is equal to $\chi_i$ when $i < N_h$.
Each interarrival time when $i > N_h$ is zero, so the proof is complete.
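
Equation (32) translates directly into code. The sketch below is illustrative only (function names and the numeric example are assumptions made here): it builds the schedule for a fixed sequence and then recovers the waiting times by simulating the single-server, first-come-first-served queue, so the result can be compared with Equation (31).

```python
def optimal_schedule(service, horizon):
    """Arrival times tau_j = min(horizon, sum of earlier service times), as in Equation (32)."""
    schedule, elapsed = [], 0.0
    for x in service:
        schedule.append(min(horizon, elapsed))
        elapsed += x
    return schedule

def waiting_times(service, schedule):
    """Waiting time of each customer, by direct simulation of the FCFS single-server queue."""
    waits, server_free_at = [], 0.0
    for x, tau in zip(service, schedule):
        start = max(tau, server_free_at)   # service starts when both customer and server are ready
        waits.append(start - tau)
        server_free_at = start + x
    return waits

service = [2, 3, 1, 4]   # illustrative service times, in sequence order
horizon = 4
schedule = optimal_schedule(service, horizon)
print(schedule)                          # arrivals pile up at the horizon once it binds
print(waiting_times(service, schedule))
```

In this example the first two customers arrive exactly as the server becomes free, and the remaining two are placed at the horizon and wait for the backlog to clear.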

Now that the optimal schedule for a given sequence is proved, the optimal
sequence can be considered. Theorem 13 proved the optimal sequence for $\tau_h = 0$.
Now consider increasing $\tau_h$.


Theorem 17 In the deterministic problem with $\tau_h < \max[\chi_1, \chi_2, \ldots, \chi_N]$, the optimal
sequence is one in which the first customer has smallest $(\chi_j - \tau_h)/c_j$, and the
rest are WSPT.

Proof: Let $\xi$ be the current value of the horizon, for convenience in the subsequent
transformations. By Theorem 16, the optimal schedule will have one customer
at zero and the rest at $\xi$. By Lemma 15, the cost for a particular sequence is equivalent
to that obtained when $\tau_h = 0$ and the first customer's service is reduced by
$\xi$. Assuming the first customer in the optimal sequence can be determined, this
transforms the problem to one with $\tau_h = 0$, and the optimal sequence is WSPT by
Theorem 13.

The problem now is to prove which should be the first customer, the one to be
transformed by reduction of service time. Consider the current transformed problem
with $\tau_h$, and suppose all service times were reduced by $\xi$, rather than just the first.
By Theorem 13, the first in the optimal schedule of this new problem would be the
one with lowest $(\chi_j - \xi)/c_j$. Relaxing the service times of each but the first customer
from $\chi_j - \xi$ back to $\chi_j$ might change the order of subsequent customers, but the
first customer in the optimal schedule would remain the same. Thus, the optimal
sequence is one in which the first customer is the one with lowest $(\chi_j - \xi)/c_j$ and
scheduled at zero, with subsequent customers in WSPT order and scheduled at $\xi$.
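
The rule in Theorem 17 can be compared with exhaustive enumeration on a single small instance. The sketch below is a check of one case under the assumption that the horizon is below the smallest service time; it is not a proof, and the names and data are hypothetical.

```python
from itertools import permutations

def sequence_cost(order, service, cost, horizon):
    """Weighted waiting cost of `order` under the no-idle schedule of Equation (32)."""
    elapsed, total = 0.0, 0.0
    for j in order:
        arrival = min(horizon, elapsed)
        total += cost[j] * (elapsed - arrival)   # wait = max(0, elapsed - horizon)
        elapsed += service[j]
    return total

def theorem17_order(service, cost, horizon):
    """First customer has smallest (service - horizon) / cost; the rest follow in WSPT order."""
    customers = range(len(service))
    first = min(customers, key=lambda j: (service[j] - horizon) / cost[j])
    rest = sorted((j for j in customers if j != first),
                  key=lambda j: service[j] / cost[j])
    return [first] + rest

service = [10.0, 2.0, 2.0]   # illustrative values only
cost = [10.0, 1.0, 1.0]
horizon = 1.0                # below the smallest service time

order = theorem17_order(service, cost, horizon)
brute_force_best = min(
    sequence_cost(p, service, cost, horizon)
    for p in permutations(range(len(service)))
)
print(order, sequence_cost(order, service, cost, horizon), brute_force_best)
```

Here the long but expensive customer is served first, which removes its large unit cost from the waiting total; the constructed sequence matches the enumerated minimum for this instance.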

The previous propositions show that, when $\tau_h$ is smaller than the smallest
service, the optimal schedule and sequence can be determined rapidly. But when $\tau_h$ is
larger, it is not clear how many customers are to be scheduled before $\tau_h$, complicating
things considerably. Let $N_h$ be the index of the first customer optimally scheduled
at $\tau_h$ for a given sequence. Let $d_i = 1$ if customer $i$ is to be scheduled before $\tau_h$ and
zero otherwise. Then, assuming the optimal customer sequence can be found for a
given $d$, the goal is to minimize $C(\tau)$ subject to $\sum_{i=1}^{N} d_i \chi_i \le \tau_h$.

Now consider the question of the optimal customer sequence given a vector $d$.
A given $d$ divides the customers into those arriving prior to $\tau_h$ and those arriving at
$\tau_h$, and both sets require optimal ordering. From the previous arguments,

$$ W_j(\tau) = \begin{cases} 0 & \forall j \le N_h \\ \sum_{i=1}^{j-1} \chi_i - \tau_h & \forall j > N_h \end{cases} \qquad C(\tau) = \sum_{j=N_h+1}^{N} c_j \left( \sum_{i=1}^{j-1} \chi_i - \tau_h \right) \tag{33} $$


The same optimal schedule cost will be obtained for every sequence in which the
same set of customers is scheduled to arrive before $\tau_h$, regardless of the order of
this set. For the set of customers arriving at $\tau_h$, the same argument as that used in
Theorem  16  can  be  used  to  show  that  the  ( NH)st  customer  must  be  the  one  from  this 
set  with  smallest  (Xj  ~  Xi  ~  Vi)  /cy  Thus,  once  d  is  selected,  the  sequence  is 

determined. The number of possible sequences to be considered is thereby reduced
from $N!$ to $2^N$. Now the problem is reduced to:


$$ \text{minimize: } C'(\tau) = \sum_{j=1}^{N} c_j \max\left[0, \sum_{i=1}^{j-1} \chi_i - \tau_h\right] \tag{34} $$

$$ \text{subject to: } \sum_{i=1}^{N} d_i \chi_i \le \tau_h \text{ and } d_i \in \{0, 1\} \;\; \forall i $$

which is a knapsack problem, albeit one with an objective that is nonlinear, both
due to the max function and to the summation changing with the ordering of the
customers. Thus, when $d_i$ changes value, the contribution of a number of customers
to the cost may change.

Since even the knapsack problem with linear cost function is NP-hard, it is
reasonable to suspect this problem is also. Lawler addressed the problem of determining
the optimal sequence that minimizes the total weighted tardiness, given due
dates (the $1||\sum w_j T_j$ problem, in currently accepted sequencing theory notation),
and proved it is strongly NP-hard [92]. Set all due dates to $\tau_h$ in the current problem.
Since the waiting time of the $j$th customer is equivalent to the tardiness of the
$(j-1)$st customer, this problem can be thought of as a variation on Lawler's, lending
further evidence that this problem is also strongly NP-hard.1

However, there are situations in which the optimal sequence is quite easy to
determine. As discussed above, when $\tau_h$ is smaller than the smallest service time,
the optimal sequence and schedule are easily determined, in the time required to
sort two sequences of length $N$. Another special case can be deduced by considering
Equation (34). There are three ways to manipulate the arrival sequence to minimize
the cost. One is to reduce the number of terms in the summation by making $N_h$
as large as possible. Another way is to minimize the $c_j$ for $j > N_h$. The last is
to minimize the contributions of the $\sum_{i=1}^{j-1} \chi_i$ expressions. The first and last goals
can be accomplished by ordering the customers by smallest to largest service. The
second goal can be accomplished by ordering customers by largest to smallest unit
cost. If these two orderings coincide (i.e., if it is true that $\chi_i < \chi_j$ implies $c_i \ge c_j$
$\forall i, j$), optimality is assured. A set of customers possessing this property is said to
have agreeable weights.2
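
The agreeable-weights condition is easy to test for a given set of customers. The following is a minimal sketch under the reading that a strictly shorter service implies a unit cost at least as large; the function names and data are illustrative assumptions.

```python
def has_agreeable_weights(service, cost):
    """True if service[i] < service[j] implies cost[i] >= cost[j] for every pair of customers."""
    n = len(service)
    return all(cost[i] >= cost[j]
               for i in range(n) for j in range(n)
               if service[i] < service[j])

def agreeable_order(service, cost):
    """Shortest service first, ties broken by larger unit cost first."""
    return sorted(range(len(service)), key=lambda k: (service[k], -cost[k]))

service = [2.0, 1.0, 3.0]   # illustrative values only
cost = [4.0, 5.0, 4.0]
if has_agreeable_weights(service, cost):
    print("agreeable; sequence:", agreeable_order(service, cost))
else:
    print("weights are not agreeable; the two orderings conflict")
```

When the check passes, the single sort by service time is simultaneously a sort by decreasing unit cost, which is the coincidence of orderings described above.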

The sequencing algorithm advocated in Chapter V performs poorly on the
deterministic problem, and one can set up problems in which the error in the optimum
found by this method is arbitrarily large. The fact that the algorithm performs so
well on stochastic problems hints that some smoothing away of local optima takes
place, allowing the algorithm to attain the global optimum. This suggests that the
addition of a similar smoothing operation might be a profitable way of attacking the
$1||\sum w_j T_j$ problem.

In conclusion, although the deterministic problem of sequencing and scheduling
customers to an appointment system is not particularly realistic, it does exhibit the
major features seen in stochastic problems, as was shown in the examples in Chapter
V. Even this gross simplification of an appointment system appears to be NP-hard,
suggesting that the stochastic sequencing problem also is NP-hard.

1 Thanks to Michael Fredley, Captain, USAF, for pointing out this connection.

2 E. L. Lawler first used agreeable in this way [91].

Appendix B.  Application of the Lattice Algorithms to Other Problems

This  appendix  explores  the  implications  of  the  convexity  and  submodularity 
of  the  cost  function  for  the  scheduling  problem.  This  structure  will  be  seen  to  be 
shared  by  other  problems,  and  the  proposed  search  algorithms  may  be  of  use  with 
them  as  well. 

Submodularity, together with convexity of the cost function with respect to
each arrival time, implies an important structure of the Hessian matrix of the cost
function (or the discrete analog to the Hessian, if the Hessian does not exist). Suppose
$C(\tau)$ is twice-differentiable with respect to $\tau$. First, since it is convex with
respect to each $\tau_j$, it must be that $\partial^2 C(\tau) / \partial \tau_j^2 \ge 0$ for all $j$.

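The proof of Theorem 2 constructed $S_2$ from $S_1$ by shifting customer $i$'s arrival
time later by $\delta_i$ and constructed $S_1'$ ($S_2'$) from $S_1$ ($S_2$) by shifting customer
$j$'s arrival time later by $\delta_j$ (Figure 8). It was then proved that
$[C(S_2') - C(S_2)] - [C(S_1') - C(S_1)] \le 0$, which holds for all $i, j$ and all positive
values of $\delta_i, \delta_j$ that do not change the order of arrivals. Then

$$ \lim_{\delta_j \to 0} \frac{[C(S_2') - C(S_2)] - [C(S_1') - C(S_1)]}{\delta_j} = \left.\frac{\partial C(\tau)}{\partial \tau_j}\right|_{S_2} - \left.\frac{\partial C(\tau)}{\partial \tau_j}\right|_{S_1} \le 0 \tag{35} $$

which in turn implies that

$$ \lim_{\delta_i \to 0} \frac{\left.\frac{\partial C(\tau)}{\partial \tau_j}\right|_{S_2} - \left.\frac{\partial C(\tau)}{\partial \tau_j}\right|_{S_1}}{\delta_i} = \left.\frac{\partial^2 C(\tau)}{\partial \tau_j \partial \tau_i}\right|_{S_1} \le 0 \tag{36} $$
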

Thus,  if  the  Hessian  exists,  it  has  a  nonnegative  diagonal,  with  nonpositive 
entries  at  all  other  positions.  This  Hessian  structure  of  submodular  functions  was 
first  noticed  by  Lorentz  [101]. 




The  search  algorithms  presented  in  this  chapter  are  based  on  the  cost  function 
being submodular as well as convex, both with respect to some vector that associates
a  scalar  with  each  customer.  The  optimal  cost  with  respect  to  this  vector  is  then 
sought.  The  lattice  algorithms  proved  effective  for  the  scheduling  of  arrivals  problem, 
and  it  is  logical  to  seek  other  problem  classes  for  which  the  algorithms  might  be 
useful. 

Topkis  pointed  out  a  number  of  submodular  problems,  including  the  max-flow, 
min-cut  problem,  an  optimal  advertising  strategy  problem,  and  optimal  control  of 
an  unreliable  system  [155].  Other  more  mathematical  problems  have  been  explored 
as  well  [32,  37,  101,  103].  Another  class  of  problems  is  proposed  here. 

One  example  of  this  class  considers  optimization  of  spatial  position  rather  than 
time.  Consider  a  collection  of  point  charges,  each  pair  of  which  repels  each  other 
with  a  force  proportional  to  the  inverse  square  of  the  distance  between  them  and 
to  the  magnitudes  of  the  two  charges.  If  the  first  and  last  point  charges  are  fixed, 
and  the  others  are  constrained  to  move  on  the  line  segment  between  them,  does 
an  equilibrium  exist?  If  so,  what  is  the  equilibrium  position  of  the  charges?  Is  it 
unique?  Here,  the  goal  would  be  to  determine  the  position  vector(s)  for  which  the 
net  force  on  each  charge  is  zero.  Another  equivalent  and  more  pertinent  approach 
would  be  to  equate  cost  with  the  total  potential  energy  of  the  system  and  minimize 
it,  where  the  potential  energy  contribution  of  each  pair  of  charges  is  proportional 
to  the  inverse  of  their  separation  and  to  their  magnitudes.  This  cost  function  is 
submodular  and  convex,  given  a  particular  ordering  of  charges  on  the  line  segment  is 
maintained.  While  such  a  problem  might  not  require  a  lattice  solution,  it  has  been 
seen  that  the  variable-lattice  algorithm  is  superior  to  typical  NLP  methods  for  rough 
approximations  of  the  optimum. 

Cost  functions  in  which  entities  tend  to  “repel”  each  other  within  a  constrained 
area  are  here  termed  jostling  functions.  Such  problems  arise  frequently  in  physics. 
Finding the optimal locations of electrons around a set of nuclei is a common example.
Other applications include resource allocation problems, such as the positioning of
microwave  relay  towers  in  an  area  so  as  to  optimize  signal  strength  and  minimize 
redundant  coverage.  It  will  be  seen  that  jostling  functions  are  submodular  and 
convex  over  a  single  spatial  dimension,  but  only  certain  jostling  functions  retain 
submodularity  for  more  complex  domains. 

In general, if $C(\tau)$ is submodular for some function $C$, it is not true that
$f(C(\tau))$ is also submodular. In fact, unless $C$ always imposes certain orderings on
the costs, Lair and Oxley showed that $f(x)$ must be an increasing linear function [87].
Topkis proved that if $C(\tau)$ is monotone and submodular and $f(x)$ is convex and
increasing, then $f(C(\tau))$ is monotone and submodular [155]. One important special
case occurs when $C(\tau)$ is separable into convex terms, each of which depends only on
the difference of two arrival times:
$C(\tau) = \sum_{i=1}^{N} \sum_{j=i+1}^{N+1} C_{ij}(\tau_j - \tau_i)$, where each $C_{ij}$
is a convex function.

Theorem 18 Let $C_{ij}$ be convex functions for all $i$ and $j$. If

$$C(\tau) = \sum_{i=1}^{N} \sum_{j=i+1}^{N+1} C_{ij}(\tau_j - \tau_i) \qquad (37)$$

then $C(\tau)$ is submodular.

Proof: One need only consider $C_{ij}$, since it is the only term of $C$ that varies
with respect to both $i$ and $j$. For notational convenience, then, let
$C(\tau) = C_{ij}(\tau_j - \tau_i)$. Let $\lambda = \delta_j / (\delta_i + \delta_j)$,
$x = \tau_j - \tau_i - \delta_i$, and $y = \tau_j - \tau_i + \delta_j$. Then

$$C(S_1') = C_{ij}(x) \qquad (38)$$

$$C(S_2) = C_{ij}(y)$$

$$C(S_1) = C_{ij}(\tau_j - \tau_i) = C_{ij}(\lambda x + (1 - \lambda) y)$$

$$C(S_2') = C_{ij}(\tau_j - \tau_i - \delta_i + \delta_j) = C_{ij}((1 - \lambda) x + \lambda y)$$

Convexity of $C_{ij}$ is both necessary and sufficient for [134: Theorem 4.1]:

$$C_{ij}(\lambda x + (1 - \lambda) y) \le \lambda C_{ij}(x) + (1 - \lambda) C_{ij}(y) \qquad (39)$$

$$C_{ij}((1 - \lambda) x + \lambda y) \le (1 - \lambda) C_{ij}(x) + \lambda C_{ij}(y)$$

the sum of which is

$$C(S_1) + C(S_2') \le C_{ij}(x) + C_{ij}(y) = C(S_1') + C(S_2) \qquad (40)$$

which proves submodularity. ∎

Corollary 19 Let $C_{ij}$ be convex functions for all $i$ and $j$. Let $f_{ij}$ be convex,
nondecreasing functions for all $i$ and $j$. If
$F(\tau) = \sum_{i=1}^{N} \sum_{j=i+1}^{N+1} f_{ij}(C_{ij}(\tau_j - \tau_i))$, then
$F(\tau)$ is submodular.

Proof: Each $f_{ij}(C_{ij}(\tau_j - \tau_i))$ is convex with respect to $\tau_j - \tau_i$
[134: Theorem 5.1], so Theorem 18 applies. ∎
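The inequality in Theorem 18 can be spot-checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the point-charge cost discussed earlier, with pair terms $q_i q_j / (\tau_j - \tau_i)$ (convex while the ordering is kept), and all function names are hypothetical. It shifts two randomly chosen entities later and reports the smallest observed value of $C(S_1') + C(S_2) - C(S_1) - C(S_2')$, which should never be negative beyond rounding error.

```python
import random


def pair_cost(tau, q):
    """Sum over i < j of q[i] * q[j] / (tau[j] - tau[i]); tau must be increasing."""
    n = len(tau)
    return sum(q[i] * q[j] / (tau[j] - tau[i])
               for i in range(n) for j in range(i + 1, n))


def shifted(tau, moves):
    """Copy of tau with tau[k] increased by d for each (k, d) in moves."""
    out = list(tau)
    for k, d in moves:
        out[k] += d
    return out


def submodular_gap(tau, q, i, j, d_i, d_j):
    """C(S1') + C(S2) - C(S1) - C(S2'); nonnegative when submodularity holds."""
    c_s1 = pair_cost(tau, q)                                   # neither shifted
    c_s1p = pair_cost(shifted(tau, [(i, d_i)]), q)             # only i shifted
    c_s2 = pair_cost(shifted(tau, [(j, d_j)]), q)              # only j shifted
    c_s2p = pair_cost(shifted(tau, [(i, d_i), (j, d_j)]), q)   # both shifted
    return c_s1p + c_s2 - c_s1 - c_s2p


def smallest_gap(trials=1000, n=6, seed=0):
    """Smallest gap seen over random positions, charges, and shifts."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        # Gaps of at least 1 and shifts below 0.5 keep the ordering intact.
        tau = [0.0]
        for _ in range(n - 1):
            tau.append(tau[-1] + 1.0 + rng.random())
        q = [0.5 + rng.random() for _ in range(n)]
        i, j = sorted(rng.sample(range(n), 2))
        gap = submodular_gap(tau, q, i, j, 0.5 * rng.random(), 0.5 * rng.random())
        worst = min(worst, gap)
    return worst
```

A random check of this kind cannot prove submodularity; it only fails to find a counterexample, which is what the theorem predicts for separable convex pair costs.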

For many multi-dimensional jostling functions, the objective function is not
separable, and submodularity does not hold. For instance, consider a set of entities
with cost linearly dependent on functions of the Minkowski distance between each pair:

$$f_{ij}(x, y) = g_{ij}\left(\sqrt[k]{|x_j - x_i|^k + |y_j - y_i|^k + \cdots}\right) \qquad (41)$$

Merely by inspection, it is clear that if $\partial^2 f_{ij} / \partial x_i \partial y_i$ and
$\partial^2 f_{ij} / \partial x_i \partial y_j$ are nonzero they will be opposite in sign.
Then the off-diagonal elements of the Hessian cannot all be nonpositive if the
optimization is over all $x$ and $y$.

On the other hand, if for each pairwise interaction, $f_{ij}(x, y)$ can be separated
into terms that each involve only the difference of two coordinates, the situation
changes. For instance, consider the above case if each $g_{ij}(z) = c_{ij} z^k$, where
$c_{ij}$ are constants.

$$f_{ij}(x, y) = c_{ij}\left(|x_j - x_i|^k + |y_j - y_i|^k + \cdots\right) \qquad (42)$$

Now $\partial^2 f_{ij}(x, y) / \partial x_i \partial y_i = \partial^2 f_{ij}(x, y) / \partial x_i \partial y_j = 0$,
and the desired Hessian structure holds if

$$\frac{\partial^2 f_{ij}(x, y)}{\partial x_i^2} \ge 0, \quad
\frac{\partial^2 f_{ij}(x, y)}{\partial x_j^2} \ge 0, \quad
\frac{\partial^2 f_{ij}(x, y)}{\partial y_i^2} \ge 0, \quad
\frac{\partial^2 f_{ij}(x, y)}{\partial y_j^2} \ge 0, \quad \ldots$$

for all $i$ and $j$.

Even if submodularity holds for a problem with unconstrained entities, it may
not hold for constrained ones. For a jostling problem with $n$ bounds, there will be a
number of entities constrained to the boundaries, possibly confounding the Hessian
structure of the problem. Suppose that in a 2-dimensional problem, the $i$th entity
were constrained by the boundary $y_i = h(x_i)$. If [...] = 0 and [...] = 0 for
the unconstrained problem, then by the chain rule, it is now required that [...] > 0.
While this requirement sometimes may be circumvented by redefinition of variables
and coordinate systems, it is nonetheless a severe restriction on the nature of the
allowable boundary conditions.


The algorithm is therefore applicable only to certain jostling problems in multiple
dimensions, limiting its usefulness. On the other hand, if the objective function
is submodular, the algorithm holds a substantial advantage over many NLP methods.
For instance, a gradient search is commonly applied to electron positioning
problems, even though they are nonconvex and the nature of the surface may be
poorly understood; optimality is far from guaranteed, even after many restarts.
Convexity is not required for the fixed- or variable-lattice algorithms to function,
however. They will fathom the feasible space until they reach the potentially
nonconvex region and stop. The lattice algorithms are guaranteed not to bypass any
optima, local or otherwise. This could prove useful in fathoming the search region
for subsequent analysis.
Appendix C. Effectiveness of the Lattice Algorithms

This  appendix  explores  the  number  of  iterations  required  and  completion  time 
for  the  fixed-  and  variable-lattice  algorithms.  Bounds  are  determined,  as  well  as  the 
functional  dependence  on  the  input  parameters. 

The fixed-lattice algorithm typically starts by finding $S_E$, the early schedule.
This is not necessary, and it may be far faster to find $S_L$ first for a problem that is
quite constrained (i.e., $\tau_h < \sum_{i=1}^{N} x_i$). On the other hand, it is necessary
to start by finding $S_E$ when there is no schedule horizon, and more efficient to do so
if the schedule horizon is very large. In the following exposition, it is assumed that
the search is started with $S_E$.

C.1 Maximum Iterations when the Horizon is Finite

For a given problem, the number of iterations to find $S_E$ can be divided into
those that improve the schedule cost (and are thus accepted as the new $S_E$) and
those that do not improve the cost. Call these successes and failures. Define the
path of the algorithm to be the sequence of schedules evaluated to reach the optimum.
The number of successes encountered is independent of the path, since for
any two paths, the starting and ending schedules are identical, and each iteration
shifts only one customer one slot. The first customer is fixed in all schedules, so the
number of successes required to reach $S_E$ from the starting schedule [ 0 0 ... 0 ]
is $\sum_{j=1}^{N} S_E(j)$. It is not possible to predict the value of $S_E$ in advance,
but each arrival time is bounded by the horizon, $\tau_h$, when it exists. Assume it does,
and let $K$ be the number of possible schedule slots. Then the maximum number of
successful iterations occurs when $S_E$ = [ 0 K-1 K-1 ... K-1 ], and that number is
$(N - 1)(K - 1) + 1$, counting the evaluation of the starting schedule.

The maximum number of failures is observed when every possible advancement
of an arrival time fails to improve the cost, unless that failure would stop the
algorithm before it reached the above worst-case $S_E$. This occurs when the schedules
[ 0 1 1 ... 1 ], [ 0 2 2 ... 2 ], ... are in the path. Between each of these
schedules, there can be at most $2N - 3$ evaluations, $N - 2$ of which are failures.
Between [ 0 K-2 ... K-2 ] and [ 0 K-1 ... K-1 ], there are no failures
possible, or the algorithm would end before reaching the worst-case $S_E$. The total
number of failures is thus bounded by $(N - 2)(K - 2)$, making the total number
of iterations to reach $S_E$ bounded by $(2N - 3)(K - 2) + N$. An example of such a
worst-case search is shown in Table 14.

Table 14. Example of worst-case search for $S_E$ using the fixed-lattice algorithm, for
4 customers and 5 schedule slots.

[ 0 0 0 0 ]  start
[ 0 0 0 1 ]  success
[ 0 0 0 2 ]  failure
[ 0 0 1 1 ]  success
[ 0 0 1 2 ]  failure
[ 0 1 1 1 ]  success
[ 0 1 1 2 ]  success
[ 0 1 1 3 ]  failure
[ 0 1 2 2 ]  success
[ 0 1 2 3 ]  failure
[ 0 2 2 2 ]  success
[ 0 2 2 3 ]  success
[ 0 2 2 4 ]  failure
[ 0 2 3 3 ]  success
[ 0 2 3 4 ]  failure
[ 0 3 3 3 ]  success
[ 0 3 3 4 ]  success
[ 0 3 4 4 ]  success
[ 0 4 4 4 ]  success

The starting point for the search for $S_L$ is obtained by shifting all but the first
customer of $S_E$ one slot later if possible. For the above worst-case $S_E$, no shifts are
possible. Any earlier choice of worst-case $S_E$ would require at most $N - 1$ further
iterations to obtain $S_L$, but this increase would be more than offset by the decrease in
iterations required to obtain $S_E$. Therefore, the above is also a bound to the number
of iterations required to obtain both $S_E$ and $S_L$. This is a substantial improvement
on Simeoni’s bound of $2(K - 1)^2(N - 1)$ [145] for the same algorithm. In the case
of the variable-lattice algorithm, a bound on the number of evaluations required for
each subsequent determination of $S_L$ or $S_E$ can be found by a similar argument.
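The worst-case counts derived above are easy to tabulate. The sketch below simply encodes the stated bounds (the function names are illustrative, not from the original work) so that they can be compared with Simeoni's bound for any N and K. The success count includes the evaluation of the starting schedule, which is how the 4-customer, 5-slot case of Table 14 arrives at 13 successes, 6 failures, and 19 evaluations in total.

```python
def max_successes(n, k):
    """Worst-case successful evaluations, counting the starting schedule."""
    return (n - 1) * (k - 1) + 1


def max_failures(n, k):
    """Worst-case failed evaluations while searching for the early schedule."""
    return (n - 2) * (k - 2)


def max_search_iterations(n, k):
    """Bound on evaluations needed to obtain the early and late schedules."""
    return (2 * n - 3) * (k - 2) + n


def simeoni_bound(n, k):
    """Simeoni's bound for the same algorithm, as quoted in the text."""
    return 2 * (k - 1) ** 2 * (n - 1)


if __name__ == "__main__":
    for n, k in [(4, 5), (10, 21), (100, 10)]:
        print(n, k, max_search_iterations(n, k), simeoni_bound(n, k))
```

The two partial counts always add up to the total, since $(N-1)(K-1) + 1 + (N-2)(K-2) = (2N-3)(K-2) + N$.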

For the fixed-lattice algorithm or the last stage of the variable-lattice algorithm,
it may be that $S_E$ and $S_L$ differ in the arrival times of a number of customers,
necessitating the enumeration of some of the schedules between $S_E$ and $S_L$. There
are at most $2^{N-1}$ such schedules, since at most $N - 1$ customers can differ, and by
Theorem 8, each of these can differ by at most one slot. This is the bound Simeoni
proposed [145]. However, for all values of $N$ above some minimum,¹ $2^{N-1}$ is greater
than the number of all feasible schedules, which limits its usefulness as a bound. The
problem is that many of these $2^{N-1}$ enumerations are infeasible, since the customers
change order. These enumerations must be subtracted from Simeoni’s bound.
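The point at which $2^{N-1}$ overtakes the number of feasible schedules can be located directly. This sketch assumes that the feasible schedules are the nondecreasing assignments of customers 2 through N to K slots, with the first customer fixed, so that their number is the binomial coefficient $\binom{N+K-2}{N-1}$ (the same coefficient that appears in the limiting ratio of Equation 45). The function names are hypothetical.

```python
from math import comb


def feasible_schedules(n, k):
    """Nondecreasing slot assignments for customers 2..n when customer 1 is fixed."""
    return comb(n + k - 2, n - 1)


def smallest_n_where_bound_exceeds(k, n_max=10_000):
    """Smallest n with 2**(n-1) > feasible_schedules(n, k), or None if not found."""
    for n in range(1, n_max + 1):
        if 2 ** (n - 1) > feasible_schedules(n, k):
            return n
    return None


if __name__ == "__main__":
    for k in (5, 10, 20):
        print(k, smallest_n_where_bound_exceeds(k))
```

Because the ratio of $2^{N-1}$ to the feasible count grows by a factor of $2N/(N+K-1)$ from one value of N to the next, the bound stays above the feasible count once it has crossed it.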

¹ This minimum is dependent on $K$. For example, when $K = 5$, $N > 11$ yields a value of
$2^{N-1}$ greater than the total number of feasible schedules, and when $K = 20$, $N > 60$
gives the same result.

The only situation in which customers might change order in the enumeration
phase is when two customers occupy the same slot in $S_E$ and one customer is shifted
one slot later while the subsequent one is not. Let $\nu(j)$ be the number of customers
arriving in slot $j$. The number of feasible schedules achievable by shifting only the
$\nu(j)$ arrival times in slot $j$ by $\Delta$ is equivalent to the number of $\nu(j)$-digit
binary numbers in which the bits are ordered from lowest to highest, which easily is
proved inductively to be $\nu(j) + 1$. In addition, $S_E$ and $S_L$ themselves are already
evaluated, as are all the schedules that differ from $S_E$ or $S_L$ in only one arrival and
by one slot, making the maximum number of schedules to be evaluated in the enumeration
phase

$$\nu(0) \prod_{j=1}^{K-2} (\nu(j) + 1) - 2 - 2\,\mathrm{sign}(\nu(0) - 1) - 2 \sum_{j=1}^{K-2} \mathrm{sign}(\nu(j)) \qquad (43)$$

where sign(x) is zero if x = 0 and one otherwise. The $j = 0$ terms of the product
and summation are evaluated separately because of the special condition that the
first customer remain in the initial slot. The $K - 1$ terms have no effect, since those
customers in the last possible slot cannot be shifted later.
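As an illustration of Equation 43, the sketch below evaluates the expression for a given early schedule and compares it with a direct enumeration. It is a hedged example only: the brute-force routine assumes that $S_L$ shifts every customer except the first by exactly one slot and that no customer of $S_E$ occupies the last slot, and the function names are not from the original work. For the early schedule [ 0 1 1 2 3 ] with 5 slots, both routines give 4.

```python
from itertools import product


def sign(x):
    """Zero if x is zero and one otherwise, as defined for Equation 43."""
    return 0 if x == 0 else 1


def enumeration_bound(s_e, k):
    """Equation 43 for an early schedule s_e (slot indices) and k slots."""
    nu = [s_e.count(j) for j in range(k)]      # customers per slot
    prod_term = nu[0]
    for j in range(1, k - 1):
        prod_term *= nu[j] + 1
    occupied = sum(sign(nu[j]) for j in range(1, k - 1))
    return prod_term - 2 - 2 * sign(nu[0] - 1) - 2 * occupied


def brute_force_count(s_e):
    """Count shift patterns that keep customer order and are not yet evaluated.

    Assumes S_L is S_E with every customer but the first shifted one slot later,
    so patterns with at most one shift, or at most one non-shift, are already known.
    """
    m = len(s_e) - 1
    count = 0
    for bits in product((0, 1), repeat=m):
        sched = [s_e[0]] + [s + b for s, b in zip(s_e[1:], bits)]
        feasible = all(a <= b for a, b in zip(sched, sched[1:]))
        already_evaluated = sum(bits) <= 1 or sum(bits) >= m - 1
        if feasible and not already_evaluated:
            count += 1
    return count
```

For very short schedules the neighbours of $S_E$ and $S_L$ overlap, so the expression is best read as a bound for problems with several shiftable customers rather than as an exact count in every case.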


Table 15. Example of enumeration for 5 customers, 5 schedule slots, $S_E$ = [ 0 1 1 2 3 ],
and $S_L$ = [ 0 2 2 3 4 ]

schedule         binary analogue   evaluate?
[ 0 1 1 2 3 ]    0000              no: already evaluated
[ 0 1 1 2 4 ]    0001              no: already evaluated
[ 0 1 1 3 3 ]    0010              no: already evaluated
[ 0 1 1 3 4 ]    0011              yes
[ 0 1 2 2 3 ]    0100              no: already evaluated
[ 0 1 2 2 4 ]    0101              yes
[ 0 1 2 3 3 ]    0110              yes
[ 0 1 2 3 4 ]    0111              no: already evaluated
[ 0 2 1 2 3 ]    1000              no: infeasible
[ 0 2 1 2 4 ]    1001              no: infeasible
[ 0 2 1 3 3 ]    1010              no: infeasible
[ 0 2 1 3 4 ]    1011              no: infeasible
[ 0 2 2 2 3 ]    1100              yes
[ 0 2 2 2 4 ]    1101              no: already evaluated
[ 0 2 2 3 3 ]    1110              no: already evaluated
[ 0 2 2 3 4 ]    1111              no: already evaluated

The number of evaluations during the enumeration phase is maximal when the
arrival times are most evenly distributed between the first through the $(K - 1)$st
slots. When $N$ is divisible by $K - 1$, the number of evaluations in the three phases
is bounded by

$$(2N - 3)(K - 2) + N + \frac{N}{K-1}\left(1 + \frac{N}{K-1}\right)^{K-2} - 2K \qquad (44)$$

and a similar expression can be obtained when $N$ is not divisible by $K - 1$. This
quantity is plotted in Figure 11.
As $N$ increases, the ratio of the maximum number of iterations required by
the algorithm to the total number of feasible schedules approaches some asymptotic
value. This value can be obtained analytically by using Stirling’s asymptotic
approximation to a factorial.

$$\lim_{N \to \infty} \frac{(2N - 3)(K - 2) + N + \frac{N}{K-1}\left(1 + \frac{N}{K-1}\right)^{K-2} - 2K}{\binom{N+K-2}{N-1}}
\approx \lim_{N \to \infty} \frac{\sqrt{2\pi}\,(N-1)^{N-0.5}\,(K-1)^{K-0.5}}{(N+K-2)^{N+K-1.5}} \left(\frac{N}{K-1}\right)^{K-1}$$

$$= \sqrt{2\pi(K-1)} \lim_{N \to \infty} \left(\frac{N}{N+K-2}\right)^{K-1} \left(\frac{N-1}{N+K-2}\right)^{N-0.5}
= \frac{\sqrt{2\pi(K-1)}}{\exp(K-1)} \qquad (45)$$

This limiting ratio is about 0.092 for $K = 5$ and 0.00093 for $K = 10$, and is reached
from below, implying that even in the worst case, only a fraction of the possible
schedules must be evaluated. In practice, the enumeration phase almost always is
quite small; it seldom exceeds 10 even for very large $N$, so the expected number
of iterations is represented better by the lower line in Figure 11. If the number of
enumerations required is assumed to be bounded and small, the number of iterations
is linear in both $N$ and $K$.
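The limiting values quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only. It assumes that Equation 44 has the form $(2N-3)(K-2) + N + \frac{N}{K-1}\left(1 + \frac{N}{K-1}\right)^{K-2} - 2K$ with $N$ divisible by $K - 1$, and that the number of feasible schedules is $\binom{N+K-2}{N-1}$. It also reports the limit obtained without Stirling's approximation for $(K-1)!$, namely $(K-1)!/(K-1)^{K-1}$, for comparison. All function names are hypothetical.

```python
from math import comb, exp, factorial, pi, sqrt


def worst_case_total(n, k):
    """Assumed form of Equation 44; requires n divisible by k - 1."""
    m = n // (k - 1)
    return (2 * n - 3) * (k - 2) + n + m * (1 + m) ** (k - 2) - 2 * k


def worst_case_ratio(n, k):
    """Worst-case evaluations divided by the number of feasible schedules."""
    return worst_case_total(n, k) / comb(n + k - 2, n - 1)


def stirling_limit(k):
    """Limiting ratio from Equation 45: sqrt(2*pi*(k-1)) / exp(k-1)."""
    return sqrt(2 * pi * (k - 1)) / exp(k - 1)


def exact_limit(k):
    """Same limit with (k-1)! kept exact instead of approximated."""
    return factorial(k - 1) / (k - 1) ** (k - 1)


if __name__ == "__main__":
    for k in (5, 10):
        print(k, stirling_limit(k), exact_limit(k), worst_case_ratio(400 * (k - 1), k))
```

The two limits differ slightly because Stirling's formula is only approximate for the fixed, small factorial $(K-1)!$; either way, the worst case touches a small fraction of the feasible schedules.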

C.2  Actual  Number  of  Iterations 

The actual number of iterations required in the enumeration phase was explored
in a series of tests. For $N < 5$, enumeration is never required, since $S_E$ and
$S_L$ can differ by at most three customers. In 77,312 trials of a 6-customer, 11-slot
schedule with randomly (but realistically) chosen parameters, 86.7% required no
enumeration, 9.8% required an enumeration phase of 4 iterations, and 3.5% required
the maximal 18 iterations in the enumeration phase. A similar 10-customer, 21-slot
series of 3000 schedule optimizations yielded the CDF seen in Figure 12. The maximal
number of iterations in the enumeration phase, 492, was observed 1.1% of the
time.

Figure 12. Estimated CDF of the number of iterations required in the enumeration
phase when optimizing a 10-customer, 21-slot schedule.
The linearity of the theoretical dependence of the maximum number of iterations
on the number of possible schedule slots is matched by empirical results for the
typical actual number of iterations, as seen in the example in Figure 13. Here, five
customers were assigned iid Erlang-2 service with mean of 2. The number of slots
was varied and evenly spaced between 0 and 5, with tv = 5 and $c_1 = c_2 = \ldots = c_6$.
The points are observed values, while the solid line is the regression line. The dotted
line represents the theoretical bound on the number of iterations for the algorithm
if no enumerations are required, which was found above.

The remarkably linear fit seen in Figure 13 ($r^2$ = 0.99998, slope = 0.96) can
be understood by means of an argument similar to that used for the maximum
number of iterations. By Theorem 9, the search for $S_E$ will end at the nearest or
next-to-nearest lattice point that is greater than the continuous optimum schedule
for  each  arrival.  This  limits  the  maximum  number  of  iterations  for  each  lattice  size 
to  approximately  the  same  horizon,  leading  to  a  formulation  similar  to  that  above 
for  the  number  of  successes.  The  number  of  failures  is  typically  negligible  compared 
to  the  number  of  successes.  The  empirical  fit  would  be  expected  to  be  poor  if  the 
enumeration  phase  required  a  disparate  number  of  evaluations  for  the  13  problems. 
For  this  example,  none  of  the  optimizations  required  an  enumeration  phase  longer 
than  6  schedules,  and  the  two  problems  with  the  largest  lattice  size  required  no 
enumeration  phase.  This  is  typical. 

Likewise, the actual dependence of the number of iterations on $N$ is highly
linear, as seen in the example in Figure 14 ($r^2$ = 0.9995). Here, the horizon and
overtime  point  are  fixed  at  7  units,  all  cost  coefficients  are  equal,  and  customers  are 
placed  into  8  slots.  No  optimization  required  an  enumeration  phase  of  more  than  3 
evaluations.  The  solid  line  is  a  linear  regression,  while  the  dotted  line  represents  the 
theoretical  bound  on  the  number  of  iterations  for  the  algorithm  if  no  enumerations 
are  required. 
Figure 13. Total number of iterations required vs. the number of schedule slots
for a set of fixed-lattice problems with 5 customers. The scale is logarithmic,
and the dependence of the number of iterations on $K^1$ is evident.

Figure 14. Total number of iterations required vs. the number of customers for
a set of fixed-lattice problems with 8 slots. The dependence of the
number of iterations on $N^1$ is evident.


The linearity of the number of iterations required by the fixed-lattice algorithm
with respect to $N$ and $K$ carries over to program execution time only if the time
required for each schedule evaluation is independent of $N$ and $K$. In fact, this is
not the case for the proposed evaluation approach for non-identical customers. The
number of matrix multiplications required for each evaluation is $O(N)$. The number
of flops required for each matrix multiplication is dependent on the cube of the size
of $Q$, which is $\sum_{j=1}^{N} r(j)$, where $r(j)$ is the number of exponential service
phases for the $j$th customer. For iid services, the number of flops for each matrix
multiplication is simply $O(N^3)$. Thus the run time of an optimization for this
simplified case is $O(N^4)$. An example of this dependence is shown in Figure 15. Here,
the same problem was used as that used to show the dependence of the number of
iterations on $N$. The slope of the lower section of this log-log graph is 0.94, while
the upper portion has slope of 3.87. This is highly suggestive of the conjectured
$N^4$ dependence for higher values of $N$. For lower values, it is conjectured that the
time consumed in preliminary calculations such as matrix exponentiation dominates
the time spent in repetitive matrix multiplication.

Figure 16 shows the dependence of run time on $K$ for the same set of problems
that were used to show dependence of the number of iterations on $K$. The calculation
of $(\exp(Q\Delta))^{(\tau_j - \tau_{j-1})/\Delta}$ is performed for each $j \in [2, N]$ by
repetitive multiplication of $\exp(Q\Delta)$, so each schedule evaluation requires
$\sum_{j=2}^{N} (\tau_j - \tau_{j-1})/\Delta = K - 1$ of these matrix multiplications, making
the number of flops required for evaluation $O(K)$ (assuming the last schedule slot is
occupied by the $N$th customer). Therefore, the optimization run time is $O(K^2)$. This
dependence is seen clearly for larger values of $K$.
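The count of $K - 1$ matrix multiplications per evaluation follows from a telescoping sum. As a short worked step, assume the slots lie at $0, \Delta, 2\Delta, \ldots, (K-1)\Delta$, the first customer arrives at $\tau_1 = 0$, and the $N$th customer occupies the last slot, so that $\tau_N = (K-1)\Delta$:

```latex
\sum_{j=2}^{N} \frac{\tau_j - \tau_{j-1}}{\Delta}
  = \frac{\tau_N - \tau_1}{\Delta}
  = \frac{(K-1)\Delta - 0}{\Delta}
  = K - 1
```

If the last customer arrives before the final slot, the same sum gives fewer multiplications, so $K - 1$ is the largest count under these assumptions.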

This dependence of run time on $N^4$ and $K^2$ leads to run times of over an
hour for $N = 100$ and $K = 10$ for two-phase services, using a 133 MHz Pentium™
processor. For larger problems, it may be advisable to pursue an evaluation algorithm
that assumes iid or Erlang services. The fixed-lattice optimization algorithm will still
prove effective for other cost evaluation approaches.

C.3  Comparison  to  Other  Optimization  Algorithms 

It is worthwhile to compare the worst-case fixed-lattice search to a worst-case
cyclic coordinate search for $S_E$. Compare Tables 14 and 16. The worst-case cyclic
coordinate search must also evaluate [ 0 j ... j ] for $j \in [0, K - 1]$. Between
these $K$ “markers”, there can be at most $N - 2$ successes and one failure, and no
failure is possible between the last two markers, making a total of
$(K - 1)(N - 1) + K - 2 + 1 = N(K - 1)$ evaluations. In the limit as $N$ approaches infinity,
the ratio of the worst-case number of fixed-lattice evaluations to the number of
cyclic coordinate evaluations is $2 - 1/(K - 1)$, so the fixed-lattice search can be thought
of as being approximately half as efficient as a simple cyclic coordinate search on a
lattice. Of course, there is no assurance a cyclic coordinate search will converge to
the optimum for these schedule problems, but this comparison establishes a rough
“price tag” on the desire to obtain the precise lattice optimum, rather than an
approximation.
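The ratio of the two worst-case counts can be checked for finite problem sizes as well as in the limit. The sketch below uses only the two bounds stated in this appendix, $(2N-3)(K-2) + N$ for the fixed-lattice search and $N(K-1)$ for the cyclic coordinate search; the function names are illustrative.

```python
def fixed_lattice_worst(n, k):
    """Worst-case evaluations for the fixed-lattice search."""
    return (2 * n - 3) * (k - 2) + n


def cyclic_coordinate_worst(n, k):
    """Worst-case evaluations for a cyclic coordinate search on the lattice."""
    return n * (k - 1)


def worst_case_ratio(n, k):
    """Fixed-lattice evaluations per cyclic coordinate evaluation."""
    return fixed_lattice_worst(n, k) / cyclic_coordinate_worst(n, k)


def limiting_ratio(k):
    """Limit of worst_case_ratio as n grows: (2k - 3) / (k - 1) = 2 - 1/(k - 1)."""
    return 2 - 1 / (k - 1)


if __name__ == "__main__":
    for n in (4, 40, 400, 4000):
        print(n, worst_case_ratio(n, 5), limiting_ratio(5))
```

For the 4-customer, 5-slot case the counts are 19 and 16, and the ratio rises toward 1.75 as the number of customers grows.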

If only an approximation to the optimum is desired, a reasonable approach
is to employ a nonlinear program (NLP), obtaining an approximation to the continuous
solution and then choosing the closest lattice schedule. For a number of
problems, quasi-Newton and Nelder-Mead searches were performed, using the
equi-spaced schedule as a starting point. Both searches were set to halt when the search
was narrowed to a region less than $\Delta$ in width in each direction. IMSL™ routines
UMINF and UMPOL were used. The results were compared to those of a fixed-lattice
approximation algorithm to find $S_E$ and $S_L$, starting at some schedule S for
which there was reasonable assurance that S' > S or S' < S.

For a problem with 10 slots, 6 customers, identical exponential services with
mean of $\Delta$, and all cost coefficients set equal, the quasi-Newton algorithm
required 53 evaluations and the Nelder-Mead algorithm required 130 evaluations to
find an approximate solution to the optimal schedule. The fixed-lattice approximation
was started from [ 0 0 2 4 6 8 ] and required 19 evaluations. Even when
the lattice approximation was started from the earliest possible schedule, it required
only 53 evaluations, still competitive with NLP methods. NLP approaches tend to
perform better compared to the fixed-lattice algorithm as the lattice size decreases,
while the fixed-lattice approximation tends to perform relatively better when the
number of customers is greater.

Table 16. Example of a worst-case search for $S_E$ using a cyclic coordinate algorithm.
N = 4 and K = 5

C.4 Comparison of the Fixed-Lattice Algorithm to Liao’s Algorithm

Liao’s  scheme  is  the  only  other  known  approach  to  finding  the  lattice  optimum 
of  an  appointment  system  [96,  95,  97].  He  employed  an  effective  branch-and-bound 
technique,  using  the  solution  of  the  associated  dynamic  scheduling  problem  as  an 
upper  bound  at  each  stage  of  the  static  problem.  This  leads  to  efficient  solutions  for 
scheduling  problems  with  iid  Erlang-r  services. 
Liao’s approach makes powerful use of recursive approaches for finding the cost
of  one  schedule  from  another  in  which  only  one  customer  has  been  shifted  one  slot. 
This  recursive  approach  to  evaluation,  which  requires  far  less  calculation  than  a  full 
evaluation  when  services  are  iid  Erlang-r,  does  not  appear  to  be  easily  effected  for 
the  case  of  idd  Coxian  services  with  no-shows.  This  is  an  area  for  future  research, 
since  Liao’s  optimization  scheme  could  be  quite  effective. 

Because  Liao’s  recursive  approach  was  such  an  integral  part  of  his  conception, 
it  is  difficult  to  compare  the  two  methods.  Table  17  shows  the  effectiveness  of  his 
algorithm  in  terms  of  the  number  of  partial  schedule  evaluations,  while  it  shows  the 
fixed-lattice  algorithm  effectiveness  in  terms  of  full  evaluations.  In  light  of  the  above 
discussion,  however,  this  seems  a  good  basis  for  comparison;  it  seems  reasonable 
to  suppose  that  either  his  recursive  evaluation  approach  could  be  employed  in  the 
fixed-lattice  optimization  or  that  such  a  scheme  would  have  to  be  abandoned  in  his 
algorithm  if  it  were  to  encompass  idd  Coxian  distributions  with  no-shows. 

The  problem  used  by  Liao  used  iid  exponential  services  with  mean  of  1.6,  with 
no  schedule  horizon  and  essentially  an  overtime  point  of  zero  [97].  The  ratio  of  the 
cost of overtime to waiting time was 3. It is unclear how he modified the lattice size
between runs, so the fixed-lattice algorithm was run using a lattice size of $5/(K - 1)$.
Because  this  particular  problem  results  in  very  fast  evaluations  for  the  fixed-lattice 
algorithm,  the  overtime  point  was  shifted  to  5.0  for  a  more  representative  comparison. 

In the first two runs below, it can be seen that Liao reports requiring more
evaluations than the total number of feasible schedules. This discrepancy
appears due to his not recognizing that the first customer’s arrival time is fixed, which
increases the number of schedules his approach had to fathom by a factor of N+K-1.

Liao’s program apparently was unable to handle K > 24 or N > 0.2K for large
values of K, due to excessive run times on an Intel 80386 processor, so comparisons
are limited [97]. Liao’s approach is apparently superior in the case of N = 8, K = 6,
but in the other runs, the fixed-lattice algorithm appears much more efficient. This
is true despite abnormally large enumeration phases in the last two cases (112 and
50 evaluations, respectively). Regardless of comparative effectiveness for these small
problems, it is clear that as N and K increase, the fixed-lattice algorithm becomes
relatively more effective, and that even if Liao’s approach were superior for some
set of smaller problems, the fixed-lattice approach would surpass it at some problem
size.


Table  17.  Comparison  of  fixed-lattice  and  Liao’s  results  [97] 
C.5 Effectiveness of the Sequencing Algorithm

Each iteration of the sequencing heuristic proposed in Chapter V is based
on determining the optimal schedules for each of the sequences that can be obtained
from the current optimum by a pairwise swap, and selecting the one with lowest cost.
The number of schedule optimizations required to reach the conjectured optimum
is therefore the product of the number of iterations required and $\binom{N}{2}$, the number
of swaps per iteration. In the worst case, the number of iterations required could
be factorial. However, as seen from Table 12, the average number of iterations is
rather low, and the maximum number of iterations required is substantially less than
the number apparently possible. The number of iterations appears exponentially
distributed, and a large number of iterations is seldom encountered. This suggests
that, while the sequencing problem is almost certainly NP-hard, the sequencing
algorithm is usually polynomial with respect to N. This is a similar result to that
obtained for the scheduling problem. It is also analogous to the application of simplex
methods to linear programming problems; the worst-case running time of the simplex
method is exponential, but the algorithm usually performs in polynomial time [15].
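The structure of one such pairwise-swap search can be sketched as follows. This is a generic best-improvement local search written for illustration, not the Chapter V heuristic itself: `optimal_cost` is a hypothetical stand-in for a full schedule optimization of a given customer sequence, and `toy_cost` is an invented placeholder used only to make the example runnable. The counter shows how the number of schedule optimizations grows as iterations times $\binom{N}{2}$.

```python
from itertools import combinations


def pairwise_swap_search(sequence, optimal_cost, max_iterations=1000):
    """Repeatedly apply the best improving pairwise swap until none exists.

    optimal_cost(sequence) is assumed to return the cost of the optimized
    schedule for that customer order (hypothetical callback).
    Returns (final sequence, its cost, number of cost evaluations).
    """
    current = list(sequence)
    current_cost = optimal_cost(current)
    evaluations = 1
    for _ in range(max_iterations):
        best, best_cost = None, current_cost
        for a, b in combinations(range(len(current)), 2):
            candidate = list(current)
            candidate[a], candidate[b] = candidate[b], candidate[a]
            cost = optimal_cost(candidate)
            evaluations += 1
            if cost < best_cost:
                best, best_cost = candidate, cost
        if best is None:          # no swap improves the cost: stop
            break
        current, current_cost = best, best_cost
    return current, current_cost, evaluations


def toy_cost(sequence):
    """Placeholder cost: position-weighted sum, minimized by descending order."""
    return sum(position * weight for position, weight in enumerate(sequence))


if __name__ == "__main__":
    order, cost, evaluations = pairwise_swap_search([3, 1, 4, 1.5, 5, 9, 2, 6], toy_cost)
    print(order, cost, evaluations)
```

With a real schedule optimizer in place of the placeholder, the search can stop at a sequence that no single swap improves without that sequence being globally optimal, which is why the result is described as a conjectured optimum.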
Appendix D. Sensitivity Analyses

This  section  empirically  explores  the  sensitivity  of  the  optimal  schedule  and 
cost to cost coefficients, no-show rate and service distribution moments. The examples
shown are all excursions from a basic schedule with five customers constrained
to  arrive  within  a  time  period  5  units  long,  divided  into  101  slots.  The  start  of 
overtime  is  set  to  5.0.  Unit  costs  for  expected  waiting  times  and  idle  times  are  set  to 
1.  As  a  baseline,  customers  have  a  show  probability  of  1,  and  service  distributions 
are  iid  with  mean  of  1.0. 

The  plots  of  the  examples,  Figures  17  through  20,  share  several  conventions. 
For  the  optimal  schedule  plots,  the  arrival  time  for  the  first  customer  is  omitted, 
since  it  is  always  zero.  The  data  for  each  customer  in  the  optimal  schedule  plots 
are  connected  by  straight  lines.  These  lines  are  not  data  fits  and  serve  merely  to 
assist  the  reader  in  visually  assembling  the  data.  On  the  other  hand,  the  data  in 
the  optimal  cost  and  expected  waiting  time  plots  are  shown  as  unconnected  points. 
Any  lines  that  appear  to  connect  these  data  are  either  empirical  or  theoretical  fits, 
as  discussed  in  each  section. 

D.1 Dependence of Optimum on Cost Coefficients

The plots in Figures 17 and 18 exemplify the dependence of optimal cost and
schedule on the individual cost coefficients. This dependence has already been discussed
in passing in Chapter IV and Figure 7. In this example, the customer services
are iid Erlang-4 distributions with mean of 1.0, and show probabilities are all 1.0.
Cost coefficients other than the one in question are held constant at 1.0.

As the relative size of $c_6$, the overtime coefficient, decreases in Figure 17, the
optimal schedule and cost approach the limit defined by setting $c_6 = 0$. As the
relative size increases without bound, the optimal schedule is forced to the earliest
possible schedule. The cost due to the expected waiting times in this case would
approach 1 + 2 + 3 + 4 = 10, while the cost due to overtime would approach
$E[W_6]\,c_6 = 0.444\,c_6$. A least-squares fit of the upper four cost data points yields
$C = 4.5 + 0.444\,c_6$. The discrepancy in intercept is due to the fact that the optimal
arrival times have not yet reached zero.
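
The limiting values quoted above can be checked numerically. The sketch below assumes the baseline of this appendix (five customers who all show, iid Erlang-4 services with mean 1.0, overtime starting at 5.0) and the limiting case in which every customer arrives at time zero. The function names are introduced here for illustration only.

```python
import math

def erlang_survival(k, rate, t):
    """P(S > t) for S ~ Erlang(k, rate), written as a Poisson partial sum."""
    x = rate * t
    return sum(math.exp(-x) * x**n / math.factorial(n) for n in range(k))

def expected_overtime(k, rate, start):
    """E[max(S - start, 0)] for S ~ Erlang(k, rate)."""
    mean = k / rate
    # Uses the identity x * f_k(x) = (k / rate) * f_{k+1}(x) for Erlang densities.
    return mean * erlang_survival(k + 1, rate, start) - start * erlang_survival(k, rate, start)

# Five iid Erlang-4 services with mean 1.0 (phase rate 4) sum to an Erlang-20.
overtime = expected_overtime(k=20, rate=4.0, start=5.0)

# Expected waits of customers 2 through 5 when all arrive together: 1 + 2 + 3 + 4.
waiting = sum(range(1, 5))
```

Under these assumptions the overtime term evaluates to roughly 0.444, in line with the slope of the fitted line.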

The three plots in Figure 18 exemplify the dependence of optimal schedule
and cost on the cost coefficient of a single customer. Each plot can be thought of as
being divided into three regions: $c_3 < 1$, $1 < c_3 < 10^4$, and $c_3 > 10^4$. In the first
region, the plots show that as the importance of the third customer decreases, the
schedule and cost approach that of a four-customer system with all cost coefficients
equal, but with the second customer having twice as many phases, and thus twice
the expected service. This region is characterized by all optimal arrival times and
expected waiting times but the third being relatively insensitive to $c_3$. Apparently
in this region, changes in the position of the third customer have little effect on the
optimal expected waiting times, and thereby positions, of subsequent customers.

In the region $1 < c_3 < 10^4$, the optimal arrival times of the fourth and fifth
customers  shift  and  begin  to  become  constrained  by  the  horizon.  This  causes  an 
appreciable  increase  in  each  expected  waiting  time  but  the  second,  and  the  cost 
increases.  A  least-squares  fit  to  a  power  function  yields  C  —  O.I8C317  with  good 
precision.  An  analytical  reason  for  this  fit  is  not  forthcoming. 

Only in the region $c_3 > 10^4$, when the optimal arrival times of the other
customers are completely constrained by the horizon, do the optimal $\tau_2$ and $E[W_2]$
shift appreciably. The fact that the arrival times of the fourth and fifth customers
are fixed at the horizon suggests that $E[W_4 + W_5 + W_6]$ approaches some amount in
excess of 1 + 2 + 3 = 6, which can be seen in the second and third plots. Since the
expected wait of customer 3 is $2.8 \cdot 10^{-3}$ at this extreme position, one would expect
the cost to be approximately
$C = 2.8 \cdot 10^{-3} c_3 + 6 + 3 \cdot 2.8 \cdot 10^{-3} = 2.8 \cdot 10^{-3} c_3 + 6.0084$.
Regression yields $C = 2.8 \cdot 10^{-3} c_3 + 6.33$. The discrepancy in this case is due to the
second customer's expected wait increasing as its optimal arrival time decreases.
When the second customer's optimal arrival time reaches zero, the expected form is
$C = 2.8 \cdot 10^{-3} c_3 + 7.0$, which was indeed observed to good accuracy.

D.2  Dependence  of  Optimum  on  Show  Probability 

The plots in Figures 19 and 20 exemplify the dependence of optimal cost and
schedule on the show probability. The customer services are iid Erlang-4 distributions
with mean of 1.0. When show probabilities for all customers are reduced
simultaneously, the optimal arrival times decrease regularly. For this example, as $\gamma$
is decreased from 1.0 to 0.6 (the typical range for medical applications), the decrease
is between 15% and 17% for each customer's optimal arrival time. For a service mean
of 4.0, a choice for which server overtime plays a larger role, this shift increases to
about 25%.

The optimal cost under this shared show probability is seen to fit very closely
the curve $C = 0.99\gamma^2$ (correlation coefficient of 0.9996). The form of the equation
for this particular example is misleadingly simple. In general, a good approximation
can be obtained by the 2-parameter form $C = a_1\gamma + a_2\gamma^2$, with correlation coefficient
over 0.999.

Further, while it is tempting to seek a simplistic explanation for the above
equation fit, the reader should be warned that linear regressions over a number of
examples indicate that individual expected waiting times of customers are better fit
by $E[W_j] = a_2\gamma^2 + a_3\gamma^3$ than by the form above.

Figure 20 shows the effect on the optimal schedule and cost of varying $\gamma_3$, the
show probability of the third customer, holding other show probabilities constant at
1.0. As $\gamma_3$ decreases, the arrival time of customer 3 also decreases slowly, as might
be  expected.  The  other  customers  shift  gradually  to  their  optimal  arrival  times  in  a 
four-customer  schedule,  as  does  the  cost. 

The third customer's position as $\gamma_3 \to 0$ can be understood by realizing that its
scheduled arrival time has negligible effect on the scheduled arrival times of the other
customers, but when it does arrive, the costs due to both $E[W_3]$ and $E[W_4]$ increase
(as do those of subsequent customers, but to a negligible extent). The optimal $\tau_3$
will balance these costs. Compare this case to the first plot in Figure 18, where as
$c_3 \to 0$, the contribution of $E[W_4]$ is nearly constant, while that of $E[W_3]$ goes to
zero, forcing the optimal $\tau_3$ to $\tau_2$. Note that the near-constant contribution
of $E[W_4]$ as $c_3 \to 0$ forces the optimal $\tau_4$ and $\tau_5$ much later than the optimal $\tau_4$ and
$\tau_5$ when $\gamma_3 \to 0$.

The  customers  adjacent  to  the  modified  customer  will  be  most  affected  by  shifts 
in  its  show  probability;  as  the  show  probability  changes  from  1.0  to  0.6,  the  fourth 
customer's optimal arrival time is reduced by 8.5%, while the second customer's
arrival  time  is  increased  by  4.8%.  Changing  each  customer’s  mean  to  2.0  or  0.5 
reduces  the  magnitude  of  these  shifts,  so  in  some  sense,  this  is  a  worst  case.  The 
solid  curve  on  the  cost  plot  is  a  quadratic  fit  (correlation  coefficient  of  0.9998). 

D.3  Dependence  of  Optimum  on  Service  Distribution  Mean 

Figure 21 exemplifies the effect as the means of each service distribution are
varied together. As the mean tends to zero, the optimal schedule tends toward equispacing,
and the cost tends to zero. As the mean increases, each customer's optimal
scheduled arrival time tends toward the schedule horizon; there is little possible
advantage accrued by scheduling them earlier, since it is almost certain that the first
customer's service will be greater than the horizon. The optimal cost in this case
tends toward the worst-case cost when a horizon is imposed, which in the case of
$c_1 = c_2 = \cdots = c_{N+1} = 1.0$ is the product of the service mean and $N(N+1)/2$. This
asymptote is shown as a solid line in the log-log plot of cost.
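
One way to see where the factor $N(N+1)/2$ comes from is the following back-of-the-envelope sketch. It assumes unit cost coefficients and a service mean so large that the horizon is negligible, so that customer k waits for roughly k - 1 full services and the overtime is roughly N services. The function name is hypothetical.

```python
def worst_case_cost(n_customers, service_mean):
    """Large-mean limit of the cost with unit coefficients (illustrative only)."""
    # Customer k (k = 2..N) waits behind k-1 services of average length service_mean.
    waits = sum((k - 1) * service_mean for k in range(2, n_customers + 1))
    # The server finishes after roughly N services, nearly all of it overtime.
    overtime = n_customers * service_mean
    return waits + overtime

baseline_limit = worst_case_cost(5, 1.0)
```

For the five-customer baseline with unit mean this gives 15, the maximum schedule cost referred to in Section D.4.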

Figure 22 exemplifies the effect as the mean of only one customer (in this case
the third) is varied. As the mean decreases, the optimal schedule and optimal cost
tend toward that of a four-customer system. The horizontal solid line in the log-log
cost plot represents this four-customer cost. As the mean increases, the fourth
and fifth customers are forced to the horizon, while the second and third find an
equilibrium. Beyond some point, the cost is dominated by the contribution of the
third customer to the expected waiting times of the subsequent customers and the
server overtime, each of which tends to the mean of the third customer reduced by
the difference in optimal arrival time between the subsequent customers (5.0) and
the third customer (1.5). The solid curve in the log-log cost plot, C = mean - 3.5,
represents this asymptotic cost.

D.4  Dependence  of  Optimum  on  Standard  Deviation  of  Service  Distribution 

Figures  23  and  24  exemplify  the  dependence  of  optimal  cost  and  schedule  on 
the  standard  deviation  of  the  service.  The  mean  for  each  customer  was  held  at  1.0. 
Each  service  distribution  was  modeled  as  an  Erlang-r  distribution  if  c,  the  coefficient 
of  variation,  was  less  than  1.0,  and  as  a  Coxian-2  distribution  otherwise.  In  the  latter 
case,  the  third  moment  was  set  at  its  minimum  value. 

The behavior of the optimal schedule as each customer's standard deviation
is varied is complex; starting from deterministic services, the arrival times are first
shifted later when $\sigma$ is increased, but as later arrivals are constrained by the horizon,
the trend for earlier customers reverses. As $\sigma$ increases, it dominates other considerations,
and customers are polarized, arriving either at the earliest possible time or
the latest.

When  the  schedule  is  close  to  deterministic,  the  cost  is  linear  with  respect  to 
variance  (slope=3.65,  intercept=0.02,  and  correlation  coefficient  of  0.999  for  c  <  2 
in  this  example).  As  the  schedule  becomes  polarized,  with  each  customer  arriving  at 
either  the  latest  or  earliest  possible  time,  the  cost  approaches  the  maximum  schedule 
cost,  as  discussed  above.  Here,  the  mean  is  constant  at  1.0,  and  that  maximum  cost 
is  15.0.  The  solid  curve  in  the  cost  plot  is  the  least-squares  fit  of  the  upper  five  data 
points to $C = a_1 - a_2\sigma^{a_3}$, which yields $a_3 = 15.004$.

Figure 24 shows the dependence when only the standard deviation of the third
customer ($\sigma_3$) is modified, holding the others constant at 2.0. The optimal schedule is
relatively insensitive to $\sigma_3$ at smaller values, but eventually the third and subsequent
customers  tend  to  some  limiting  optimal  schedule.  That  schedule  has  the  third  and 
fourth  customers  arrive  simultaneously;  the  advantage  accrued  when  the  service  of 
the  third  customer  is  small  outweighs  the  disadvantage  when  it  is  large,  so  an  increase 
in variability drives the customers closer. The cost at small $\sigma$ is again linear in $\sigma$
(slope of 1.46 and correlation coefficient of 0.999). As $\sigma$ increases sufficiently, the cost
approaches  that  of  the  limiting  4-customer  schedule  obtained  by  removing  customer 
3  (0.78),  plus  3.0.  If  it  were  certain  that  the  third  customer’s  service  would  be 
sufficient  to  cause  each  subsequent  customer  to  wait,  insertion  of  the  third  customer 
just  before  the  fourth  would  add  six  units  of  cost.  The  probability  of  this  occurrence 
is  close  to  50%,  which  results  in  the  3.0  added  units  of  cost. 

D.5  Dependence  of  Optimum  on  Service  Distribution  Skewness 

When c = 1.0, the optimal schedule and cost are quite insensitive to the third
moment. Define $\gamma$ as the skewness of the distribution for this section only. As
$\gamma$ is varied over its entire possible range (cf. Section F.2) while holding mean and
variance constant at 1.0 and 2.0, only a 6.5% change in cost is observed. The optimal
scheduled  arrival  time  for  each  customer  varies  at  most  25%  of  the  service  mean  over 
this  range. 

Figures 25 and 26 show the dependence of optimal cost and schedule on service
distribution skewness when c has been increased to 1.73. These plots exemplify the
dependence on $\gamma$ for higher coefficients of variation. Moments were matched using
a Cox-plus-Erlang-r distribution, as described in Section F.9, and r was arbitrarily
limited to 16. For this reason, the smallest $\gamma$ obtainable was 0.22, which is the
minimum value displayed on the plots.

As all customers' skewnesses are increased from zero together, the optimal
schedule approaches some limit. Above $\gamma = 0.8$, the schedule is quite insensitive to $\gamma$.

As the skewness increases, the cost decreases. This can be understood by noting
that, in order to increase the skewness of a given distribution while maintaining the
same mean and variance, it is necessary not only to shift a portion of the probability
mass farther into the tail but also to shift a larger portion of the probability
mass lower. This second action dominates the first in the outcomes illustrated.

The cost approaches 3.77 as the skewness is decreased to zero and 7.16 as
the skewness increases without bound. These values were determined by a least-squares
approach to fitting the lower data to $C = a_1 + a_2\gamma^{a_3}$ and the upper data
to $C = a_1 + a_2\exp(a_3\gamma)$, and the solid curves in the cost plot show these fits. An
analytical explanation of these limits is not apparent.

Figure 26 shows the dependence of the optimum on a single service skewness.
When the third skewness is near zero, there is a possibility that the service will
be very small. As a result, the fourth customer's optimal arrival time converges to
that of the third. Again, for skewnesses over 0.8, the schedule is quite insensitive to
changes in skewness.
[Plot panels: cost; scheduled arrival times. Horizontal axis: show probability of third customer.]

Figure 20. Optimal schedule and cost dependence on show probability of a single
customer. Here, $\gamma_3$ is varied, with all other show probabilities fixed at
unity.

Figure 23. Optimal schedule and cost dependence on the service standard deviation
of all customers. The same standard deviation is shared by each
customer.

[Plot horizontal axis: standard deviation of third customer's service.]

Figure 24. Optimal schedule and cost dependence on the service standard deviation
of a single customer. Here, the standard deviation of the third
customer's service is varied while the others are held constant.




[Plot axes: log(cost); skewness of each customer's service distribution; log(skewness of each customer's service distribution).]

Figure 25. Optimal schedule and cost dependence on the service skewness of all
customers. The same skewness is shared by each customer, and c =
Appendix E. Medical Scheduling Example


This appendix describes a study performed in 1996-97 by the author in conjunction
with the Primary Care Clinic at Wright-Patterson AFB, OH. Appointment
time data were collected over several months for 146 patients of a particular
doctor.¹ From these data, appointment frequency, show rates, and service distributions
were estimated for various classes of patients. Estimates of current cost
and potential improvement from sequence and schedule optimization were then determined.
The purpose was a preliminary determination of whether sequence and
schedule optimization were appropriate and of value for this clinic.

E.1 Data

On a typical day, the doctor in this study would see 5 to 7 patients in the
morning or afternoon over a period of approximately 170 minutes (horizon). Appointments
were made by a clerk with no medical training, with slots being either
20 or 30 minutes in length, depending on whether the patient had been seen before.
Classification  of  patients  at  the  time  of  the  appointment  was  based  solely  on  whether 
the  patient  was  a  military  dependent,  retired  military,  or  military  on  active  duty, 
and  all  patients  in  this  study  were  retired  military  or  aged  dependents.  The  average 
patient  waiting  time  was  about  7  minutes. 

The doctor recorded the following data for 146 patients over 22 days: appointment
time, starting and ending times of the doctor's service, whether the patient
showed, and the patient's classification. Data collection was stopped when the doctor
was given orders to a new base. Although he continued to see patients for several
months after this time, patient service time after this point was increased as the
doctor sought to arrange follow-on care upon his departure. It was judged that the
later data were not usable.

¹ Dr Charles Beleny, Lt Col, USAF, Chief of Wright-Patterson Air Force Base Primary Care
Clinic, provided access to clinic personnel and gave the author invaluable scheduling advice. Dr
Robert Nardino, Maj, USAF, a physician in the primary care clinic at the time, collected the data
using his own patients.

The  doctor  was  asked  to  develop  a  patient  classification  scheme  that  could 
predict  service  time  more  accurately  than  the  aggregate  mean.  Such  a  scheme  was 
to  be  easy  for  the  scheduler  to  use  to  classify  patients  over  the  phone.  The  scheme 
he  settled  on  was  to  record  the  number  of  new  complaints  and  the  number  of  chronic 
complaints (complaints previously treated by this doctor) the patient requested medical
care for at the time he/she made the appointment. The paucity of the data in
this  preliminary  study  necessitated  the  aggregation  of  the  eight  categories  originally 
envisioned  into  the  following  three: 

-  Class  A:  fewer  than  three  chronic  complaints,  no  new  ones  (33  data  points). 

-  Class  B:  more  than  three  chronic  complaints,  no  new  ones  (50  data  points). 

-  Class  C:  at  least  one  new  complaint  (56  data  points). 

One  might  classify  this  strategy  of  dividing  the  customers  into  classes  as  a 
variance  reduction  technique.  By  recognizing  different  classes,  perhaps  one  could 
take  advantage  of  additional  information  to  establish  more  certainty  in  the  service 
time  of  a  customer.  The  second  point  is  certainly  valid,  but  this  example  suggests 
that  the  information  gained  is  not  contained  in  the  variance,  as  some  researchers 
have  asserted  or  conjectured  [31,  164].  The  variance  of  the  full  sample  is  77.1,  while 
the  variances  for  classes  A,  B,  and  C  customers  are  16.0,  106.7,  and  66.7.  Variance 
is  not  reduced  enough  to  warrant  major  improvements  in  prediction.  One  should 
not  think  of  this  approach  as  a  method  of  establishing  customer  service  times  more 
certainly,  but  rather  as  a  method  of  taking  into  account  the  service  time  distribution 
information  in  some  more  complex  way. 
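
The claim that classification leaves most of the variance in place can be illustrated with a pooled within-class variance. The sketch below uses the variances and class sizes quoted above. Treating those counts as the sample sizes behind each variance, and weighting by n - 1, are assumptions made here for illustration.

```python
# Figures quoted in the text; the pairing of counts with variances is an assumption.
counts = {"A": 33, "B": 50, "C": 56}
variances = {"A": 16.0, "B": 106.7, "C": 66.7}
full_sample_variance = 77.1

def pooled_variance(counts, variances):
    """Within-class variance, weighting each class by its degrees of freedom."""
    numerator = sum((counts[c] - 1) * variances[c] for c in counts)
    denominator = sum(counts[c] - 1 for c in counts)
    return numerator / denominator

within_class_variance = pooled_variance(counts, variances)
relative_reduction = 1.0 - within_class_variance / full_sample_variance
```

Under these assumptions the pooled within-class variance remains close to the full-sample value, which is consistent with the remark that variance is not reduced enough to improve prediction substantially.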

Histograms  of  the  observed  service  times  for  these  classes  are  shown  in  Figure  27. 
These  histograms  do  not  include  no-shows. 
E.2 Assumptions

A number of assumptions were made as the study progressed. Some are listed here:

-  The  doctor’s  scheduling  horizon  is  exactly  170  minutes. 

-  Six  patients  require  scheduling  each  day. 

-  Patients  cannot  be  scheduled  more  accurately  than  10  minutes.  Thus,  18 
schedule  slots  are  adequate. 

-  The  unit  value  of  each  patient’s  time  is  the  same,  and  the  doctor’s  time  is  three 
times  more  valuable  than  that  of  the  patients. 

-  The  doctor  is  able  to  utilize  idle  time  during  the  schedule  horizon  productively, 
so  the  value  assigned  to  his  time  during  the  scheduling  period  is  constant.  Only 
his time spent after the end of the scheduling period is assigned a cost ($\tau_o = \tau_h$).

The  doctor  agreed  these  were  reasonable  assumptions,  although  it  appeared  difficult 
to  ascertain  a  good  value  for  the  relative  unit  costs  of  patient  and  doctor  time. 


E.3  Analysis  and  Results 

Sample statistics for the three service classes are shown in Table 18.


class   mean    $c^2$   skewness   show rate
A       18.75   0.046   0.46       0.89
B       21.94   0.139   2.51       0.95
C       25.56   0.141   0.28       0.92

The best fits to the service distributions were found via a commercial software package
to be truncated Erlang and truncated gamma distributions. However, because
the coefficients of variation are so low, it was decided that matching the first two
moments was sufficiently accurate. The Cox-plus-Erlang-r distribution described in
Appendix F was fit to the data. Because the coefficient of variation is less than
1, this distribution is equivalent to a generalized Erlang distribution. The Coxian
parameters in Table 19 were determined.


Table 19. Service distribution approximations for the medical example


class   Erlang phases   Coxian phase rate   Erlang phase rates   Cox transition probability
A       22              1.173               1.173                0.9994
B       8               0.360               0.360                0.9842
C       8               0.308               0.308                0.9820
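
The parameters in Table 19 can be checked against the sample moments in Table 18. The sketch below rests on an assumed reading of the Cox-plus-Erlang-r structure, since Appendix F is not reproduced here: one exponential phase at the tabulated rate, followed with the tabulated transition probability by the remaining phases at the same rate. The function name is hypothetical.

```python
def cox_erlang_moments(phases, rate, p):
    """Mean and squared coefficient of variation under the assumed structure:
    one exponential phase, then with probability p an Erlang-(phases-1) stage."""
    m = phases - 1
    mean = (1 + p * m) / rate
    # Var(first phase) = 1/rate^2; Var(B*Y) = p*m*(m+1)/rate^2 - (p*m/rate)^2.
    variance = (1 + p * m * (m + 1) - (p * m) ** 2) / rate**2
    return mean, variance / mean**2

table_19 = {"A": (22, 1.173, 0.9994), "B": (8, 0.360, 0.9842), "C": (8, 0.308, 0.9820)}
table_18 = {"A": (18.75, 0.046), "B": (21.94, 0.139), "C": (25.56, 0.141)}

fitted = {cls: cox_erlang_moments(*params) for cls, params in table_19.items()}
```

Under this reading, the fitted means and squared coefficients of variation agree with Table 18 to within the rounding of the tabulated parameters.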

Using these service distribution approximations, the optimization programs in
Appendix H were used to obtain the optimal sequence and schedule for each combination
of patients that might be scheduled on a given day, under the current policy.
There are 28 possible combinations with 3 classes and 6 customers. Exhaustive
enumeration of the $3^6 = 729$ possible permutations of these combinations was performed,
and the optimization of each permutation took between 6 and 10 minutes
on a 133MHz Pentium processor. Table 20 shows the results.
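
The counts above, and the "random frequency" column of Table 20, can be reproduced with a short enumeration. The sketch assumes that the random frequency of a combination is the multinomial probability of drawing it when each of the six patients is independently of class A, B, or C with the sample frequencies 0.237, 0.360, and 0.403. That interpretation, and the names used, are assumptions made here.

```python
from itertools import combinations_with_replacement
from math import factorial

class_freq = {"A": 0.237, "B": 0.360, "C": 0.403}

def random_frequency(combo):
    """Multinomial probability of this multiset of six patient classes."""
    prob = float(factorial(len(combo)))
    for cls, freq in class_freq.items():
        count = combo.count(cls)
        prob *= freq**count / factorial(count)
    return prob

# Unordered combinations of 6 patients drawn from 3 classes.
combos = ["".join(c) for c in combinations_with_replacement("ABC", 6)]
frequencies = {c: random_frequency(c) for c in combos}

# Ordered sequences of 6 patients drawn from 3 classes.
n_sequences = len(class_freq) ** 6
```

Sorting `frequencies` in decreasing order gives the row order used in Table 20.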

The  greedy  sequencing  algorithm  proposed  in  Chapter  V  was  also  employed 
to  find  the  optimum  for  each  combination  above.  For  all  but  two  combinations,  the 
global  optimum  was  obtained,  regardless  of  the  starting  sequence  used  for  the  greedy 
algorithm.  In  the  case  of  sequence  AAABCC,  of  the  60  possible  starting  sequences, 
16  yielded  a  suboptimal  sequence  with  a  cost  that  exceeded  the  optimum  by  0.02 
(0.3%),  and  14  others  yielded  a  suboptimal  sequence  that  exceeded  the  optimum 
by 0.006 (0.09%). In the case of sequence AAAABC, 7 of the 15 possible starting
sequences  yielded  a  suboptimal  sequence  that  exceeded  the  optimum  by  0.11  (3.1%). 
However,  although  these  errors  may  be  significant,  neither  of  the  combinations  turned 
out  to  play  a  role  in  the  solution  to  be  proposed. 

The  relative  frequencies  of  classes  A,  B,  and  C  in  the  data  sample  are  0.237, 
0.360,  and  0.403,  respectively.  If  these  frequencies  hold  in  the  long  run,  a  reasonable 
appointment  strategy  would  be  to  employ  only  some  subset  of  the  optimal  sequences 
and  ensure  that  subset  allows  the  scheduler  to  cope  with  small  variations  from  the 
expected relative frequencies of A, B, and C requiring appointments. The scheduler
would choose the schedule for a day from this subset on the basis of the relative
frequency of open appointments on days for which the sequence is already chosen.
For example, if the open A slots exceed 0.237 of the total open slots for the days
on which the sequence is already fixed, then the decision should be to fix the next
undecided day at some sequence with few slots of class A.

Table 20. Optima for each combination for the medical scheduling problem. The
possible combinations for this problem, their frequency of occurrence
under random selection, and the optimum sequence and schedule, given
that combination.

              random      optimum
combination   frequency   permutation   schedule               cost
ABBCCC        0.1206      ACCCBB        0 20 50 80 110 140     16.94
ABBBCC        0.1077      ACBCBB        0 20 50 80 110 140     13.86
AABBCC        0.1064      AABBCC        0 20 40 70 100 130      9.65
AABCCC        0.0794      AACCBC        0 20 40 70 100 130     12.01
ABCCCC        0.0675      ACCCCB        0 20 50 80 110 140     20.48
AABBBC        0.0634      AABBBC        0 20 40 70 100 130      7.57
BBBCCC        0.0611      BBCCCB        0 20 50 80 110 140     23.44
BBCCCC        0.0513      BCCCCB        0 20 50 80 110 140     27.40
ABBBBC        0.0481      ACBBBB        0 20 50 80 110 140     11.24
AAABCC        0.0467      AABCAC        0 20 40 70 110 130      6.95
AAABBC        0.0417      ABBCAA        0 20 50 80 120 140      5.16
BBBBCC        0.0409      BBCCBB        0 20 50 80 110 140     19.89
BCCCCC        0.0230      BCCCCC        0 20 50 80 110 140     31.88
AACCCC        0.0222      AACCCC        0 20 40 70 100 130     14.84
AAACCC        0.0174      AACCAC        0 20 40 70 110 130      8.82
ACCCCC        0.0151      ACCCCC        0 20 50 80 110 140     24.90
BBBBBC        0.0146      BBBCBB        0 20 50 80 110 140     16.82
AABBBB        0.0142      ABBBBA        0 20 50 80 110 140      5.91
AAAABC        0.0137      ACBAAA        0 20 60 90 120 140      3.53
AAABBB        0.0124      BBBAAA        0 30 60 90 120 140      4.21
ABBBBB        0.0086      ABBBBB        0 20 50 80 110 140      9.06
AAAACC        0.0077      ACACAA        0 20 60 80 120 140      4.65
AAAABB        0.0061      BABAAA        0 30 60 90 120 140      2.79
CCCCCC        0.0043      CCCCCC        0 20 50 80 110 140     38.60
BBBBBB        0.0022      BBBBBB        0 20 50 80 110 140     14.18
AAAAAC        0.0018      AACAAA        0 20 50 90 120 140      2.39
AAAAAB        0.0016      BAAAAA        0 30 60 90 110 140      1.79
AAAAAA        0.0002      AAAAAA        0 20 50 80 110 140      0.96

Assume for the moment that short-term variations in the relative frequency of
appointment demands by the three customer classes can be handled. The problem is
then: With what relative frequencies should the above sequences be selected to attain
the required long-term relative frequencies of A, B, and C so that the expected cost
is minimized? This problem is in the form of a linear program, in which the relative
probabilities are the decision variables. Four constraints are imposed: all decision
variables are nonnegative; the decision variables must sum to 1.0; and the relative
expected frequencies of A, B, and C must be in the ratios of 0.237 : 0.360 : 0.403,
the last of which supplies two constraints.
The  solution  has  an  expected  cost  of  13.07  and  requires  the  use  of  only  the  three 
sequences shown in Table 21.

Table 21. Optimal sequence set for the medical study. Because of variations in
the proportion of requests from each class, this set can only support the
stated goals 79% of the time.

                                        theoretical   observed
sequence   schedule               cost    frequency     frequency
ACBCBB     0 20 50 80 110 140     13.86   0.5784        0.648
AACCBC     0 20 40 70 100 130     12.01   0.4200        0.344
ACBBBB     0 20 50 80 110 140     11.24   0.0012        0.008

This strategy of only permitting these three sequences
allows variations in the relative frequencies of $A \in [\frac{1}{6}, \frac{1}{3}]$, $B \in [\frac{1}{6}, \frac{2}{3}]$, $C \in [\frac{1}{6}, \frac{1}{2}]$. The
actual  variations  encountered  in  the  relative  frequencies  of  requesting  customers 
may  be  great  enough  to  cause  two  problems.  The  first  is  that  some  schedule  slots 
go  unfilled,  for  lack  of  customers  of  the  correct  type  making  requests  in  time  to  use 
the  slots.  The  second  is  that  a  large  number  of  customers  of  a  single  type  making 
requests in a small time period may cause an unacceptably long delay between a
request  for  the  appointment  and  the  appointment  itself  (hereafter  simply  called  the 
delay). 
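
The three-sequence solution reported in Table 21 can be checked for consistency with the linear program's constraints. The sketch below takes the sequences, costs, and theoretical frequencies from that table and recomputes the expected cost, the long-run class shares, and the range of class shares that the three sequences can span. It verifies the reported solution under these inputs and is not a solver for the linear program itself. The variable names are introduced here.

```python
from fractions import Fraction

# sequence -> (optimal cost, theoretical frequency), as listed in Table 21
mix = {"ACBCBB": (13.86, 0.5784), "AACCBC": (12.01, 0.4200), "ACBBBB": (11.24, 0.0012)}

# Expected cost of the mixed strategy.
expected_cost = sum(cost * freq for cost, freq in mix.values())

# Long-run share of appointment slots given to each class.
class_share = {
    cls: sum(freq * seq.count(cls) / len(seq) for seq, (_, freq) in mix.items())
    for cls in "ABC"
}

# Smallest and largest share of a day's slots each class can receive.
share_range = {
    cls: (min(Fraction(seq.count(cls), len(seq)) for seq in mix),
          max(Fraction(seq.count(cls), len(seq)) for seq in mix))
    for cls in "ABC"
}
```

The recomputed cost and class shares match the stated 13.07 and 0.237 : 0.360 : 0.403 to within the rounding of the tabulated frequencies.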

To approach these problems, a simulation of the appointment system was developed
using FORTRAN 90. A simple set of rules determined the choice of combination
for the next schedule. In a series of trials of 100 days of operation, the cost of
the above strategy was 13.52 with a standard deviation of 0.02, slightly higher than
the theoretical cost. As seen in Table 21, the frequencies of the three combinations
were not close to the theoretical values, either. The discrepancies stem from three
main sources. The first is that no value was placed on unfilled appointments. There
were on average 8.0, 11.4, and 5.5 unfilled slots of types A, B, and C, respectively.
These were not accounted for theoretically. A second source of discrepancy stems
from the fact that the theoretical solution is steady-state. This simulation was run for
only 100 days, retaining the initial scheduling period in the tally. When the simulation
was run for 900 days, the average cost dropped to 12.8. Last, the simulation's
rules for selecting the combination for a given day were not optimal. Given these
differences, the agreement between theory and simulation is good.

The  doctor  desired  that  the  maximum  delay  between  request  and  appointment 
be  one  week  (5  scheduling  periods).  The  simulation  revealed  that  an  average  of 
14.0%  of  the  customers  exceeded  this  limit,  with  the  longest  delay  observed  being  11 
scheduling  periods.  Assuming  this  delay  is  deemed  excessive,  a  reasonable  solution 
might  be  to  retain  the  three  combinations  in  the  solution  above,  but  also  to  employ 
the  sequences  BBBAAA  (cost  of  4.21)  and  CCCCCC  (cost  of  38.6)  when  necessary. 
Table  22  summarizes  this  strategy.  Using  a  similar  rule  base  to  that  above,  and 
applying  the  same  set  of  random  number  streams,  the  simulation  yielded  a  cost  of 
13.9, with a standard deviation of 0.06. An average of 2% of the customers suffered
delays of over a week, with the maximum delay observed being 7 scheduling
periods. Thus, the goal of reducing delays can be nearly met with the 5-combination
strategy, with only a 3% increase in cost over the 3-combination strategy.
Further reductions in delays are possible with the inclusion of a few more possible
combinations.

Table 22. The 98% solution for the medical study. Extending the number of
feasible combinations to 5 increases the cost 3% but reduces the number
of unacceptable delays from 14% to 2%. The frequencies noted are the
relative frequencies observed over ten runs of the simulation of a 100-day
schedule.

sequence   schedule (minutes)       cost     frequency
ACBCBB     0  20  50  80  110  140  13.86    0.547
AACCBC     0  20  40  70  100  130  12.01    0.314
ACBBBB     0  20  50  80  110  140  11.24    0.020
BBBAAA     0  30  60  90  120  140   4.21    0.066
CCCCCC     0  20  50  80  110  140  38.6     0.053

What is the expected cost of the current approach, which entails
scheduling customers into preset slots of approximately 20 minutes without regard
to the classes defined here? A good estimate of the cost of this current system would
require determining the cost of each permutation of the 22 possible combinations,
for each of the 16 different schedules observed in the data. This requires excessive
calculation, and it was judged that sufficient accuracy could be obtained for the
purpose by considering only the 3 most frequently observed 6-customer schedules
and the permutations of only the 8 most probable combinations of classes. Assume
the schedule is one of [ 0 20 40 70 90 110 ], [ 0 40 60 110 130 150 ], or
[ 0 60 80 110 130 150 ]. These were the most common schedules observed,
each accounting for 15% of the total. Assume the combination for each day is one of
ABBCCC, BBCCCC, BBBCCC, AABBBC, ABCCCC, AABCCC, AABBCC, or
ABBBCC. These 8 combinations will account for 70% of the total, if the
relative frequencies of classes A, B, and C are those observed in the data set.

If  the  sequence  of  customers  is  chosen  at  random  from  the  above  subset  of 
8,  the  expected  costs  of  the  three  schedules  are  found  to  be  36.4,  49.5,  and  60.5, 
respectively,  for  an  average  of  48.8.  This  is  taken  here  as  the  current  daily  cost  of
operation. 

In  real  terms,  this  suggests  that  the  equivalent  of  48.8  minutes  of  productive 
time  is  lost,  either  through  the  doctor  having  to  spend  extra  time  at  the  end  of  the 
scheduling  period  (time  that  is  weighted  by  a  factor  of  3)  or  through  patients  having 
to wait. The potential improvement attained by sequence and schedule optimization
is thus 67%.

While the patients in this particular study were all military retirees, it is
instructive to obtain a rough estimate of the savings if the proposed scheme were
implemented for patients who were active-duty U.S. Air Force servicemembers. The
average hourly cost to the government of a servicemember's time is $21.13,² and
the doctor sees patients for about 260 days each year. The yearly savings for that
practice alone is thus $21.13 × (48.8/60) × 260 × 0.67 ≈ $3000. This particular practice
had a comparatively low waiting time per patient to begin with, so this is a
conservative estimate of the potential savings that could be accrued from implementing
such a scheme for each of the other outpatient practices in this hospital. It should be
emphasized that this is the savings to the government as a whole; since the hospital
would benefit financially from only the reduction in doctor overtime, and since even
that savings might not have direct financial impact, there might be less motivation
to adopt the new scheme.

It is also instructive to consider the relative improvements when the schedule
and the sequence are optimized separately. Suppose some better choice of schedule
is imposed each day while still ignoring the class of the patient. Here,
[ 0 20 50 80 100 130 ] was chosen as a reasonable schedule. It differs from each of
the optimal schedules in Table 20 by at most 10 minutes for each customer arrival.
The expected daily cost under this scheme is 22.2. This represents an improvement
over current operations of 55%. This improvement is commensurate with that found
by researchers who recommended appointment schedule (but not sequence)
optimization for specific operations on the basis of simulations, such as Soriano's
50% improvement [149] or Glendenning's 25% improvement [51].

² This average hourly cost for a USAF servicemember is derived from the pay rates in Air Force
Instruction 65-503, Attachment 20, for FY96 (fiscal year 1996), dated 22 March 1996. These pay
rates were prorated using the force strengths for FY96 cited in tables in Air Force Magazine, May
1996, p. 41. This calculation includes the cost to the Department of Defense of most benefits.

Compare this improvement to that attained when another (less realistic) policy
is chosen. Suppose the combinations and schedules for each day are chosen at random
from the above possibilities, but once chosen, the customers are ordered optimally.
The costs for the three schedules become 27.8, 34.2, and 49.0, respectively, for
an average of 37.0, a decrease of 24% from the current cost. It has been the case
in each of a small series of tests that the improvement attained by optimizing the
schedule is larger than that attained by optimizing the sequence of customers.
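The averages and percentages quoted in this comparison can be reproduced with a few lines of arithmetic. The sketch below is only a cross-check of the figures stated in the text; the variable and function names are illustrative and do not come from the original simulation code.

```python
# Figures quoted in the text (cost units: minutes of weighted lost time per day).
current_costs = [36.4, 49.5, 60.5]        # three observed schedules, random sequence
sequence_only_costs = [27.8, 34.2, 49.0]  # same schedules, customers ordered optimally
schedule_only_cost = 22.2                 # improved fixed schedule, classes ignored


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def improvement(new_cost, old_cost):
    """Fractional reduction in cost relative to old_cost."""
    return 1.0 - new_cost / old_cost


current = mean(current_costs)
sequence_only = mean(sequence_only_costs)

# Yearly savings estimate from the text:
# hourly cost x hours lost per day x working days x fractional improvement.
yearly_savings = 21.13 * (current / 60.0) * 260 * 0.67

print(f"current daily cost:        {current:.1f}")
print(f"schedule-only improvement: {improvement(schedule_only_cost, current):.0%}")
print(f"sequence-only improvement: {improvement(sequence_only, current):.0%}")
print(f"yearly savings estimate:   ${yearly_savings:.0f}")
```

The savings estimate comes out just under $3000, which agrees with the rounded figure in the text.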

E.4  Outcome 

The  proposed  policy  was  not  implemented  by  the  clinic,  since  this  was  only 
a  feasibility  study  and  was  specific  to  a  single  physician,  who  had  already  moved 
to  another  job.  The  general  approach  seems  highly  promising  as  a  way  of  reducing 
patient  waiting  time  and  doctor  overtime.  However,  the  proposal  to  expand  the 
study  and  implement  results  on  a  small  sample  met  with  some  reluctance  from  the 
clinic  for  several  reasons: 

-  The  clinic  had  just  lost  two  of  its  three  schedulers,  due  to  personnel  cuts,  and 
it  was  deemed  a  poor  time  to  make  any  changes  that  would  complicate  the 
scheduling  process. 

-  Another  recent  study  put  the  current  average  waiting  time  in  the  clinic  at  9 
minutes  per  patient,  and  this  was  deemed  sufficiently  low;  reducing  waiting 
time  was  not  considered  a  high  priority.

-  The  doctor  with  whom  the  study  was  performed  has  a  very  specialized  practice,
and  it  was  not  clear  that  similar  benefits  would  accrue  to  other  practices.  In 
particular,  the  patient  load  was  much  heavier  for  other  doctors. 

-  The  minimum  acceptable  lattice  size  is  10  minutes.  Because  that  is  so  large
relative  to  the  mean  service  time,  it  was  thought  that  a  modification  of  the 
scheduling  protocol  would  not  achieve  much  improvement.  It  was  suggested 
that  a  surgery  or  other  practice  with  longer  appointments  might  accrue  greater 
benefit. 

-  There  was  already  a  patient  classification  scheme  in  place,  used  for  reasons 
other  than  scheduling  effectiveness.  Any  scheme  that  forced  other  patient 
orderings  would  not  be  acceptable. 

Some of these objections are easily dismissed. For instance, in the preliminary
study above, if decreasing waiting time is not a priority, waiting time and overtime
could be held at the same level and a seventh patient added to the schedule, without
increasing the cost. Alternatively, the schedule horizon could be shortened by 25
minutes without increasing the cost. As for concerns over the forced coarseness
of the lattice, the coarseness did not prevent substantial improvement in the
preliminary study.

It  was  proposed  that,  even  though  sequence  optimization  is  objectionable,  an 
implementation  of  a  different  schedule  could  still  reduce  waiting  time  substantially. 
In  the  above  case,  such  a  strategy  led  to  55%  improvement.  This  point  is  being 
considered,  and  such  a  scheme  could  be  adopted  once  personnel  pressures  ease. 

Other objections clearly cannot be argued away by those outside the
profession. The additional burden on administrative staff of the new protocol and the
sacrosanctity of the current patient classification and ordering system precluded any
further testing of the proposed sequencing optimization in this clinic.

In summary, a preliminary examination of a particular medical scheduling
protocol  indicated  that  the  cost  of  operation  could  be  reduced  by  a  factor  of  3, 
where  the  cost  is  defined  as  the  sum  of  the  patient  waiting  times  and  three  times 
the  doctor’s  overtime.  No  further  resources  are  needed  to  achieve  this  improvement. 
Changes  to  the  scheduling  procedure  include  the  scheduler  questioning  the  patient 
as  to  whether  his  or  her  ailments  have  been  attended  to  before  by  this  doctor  and 
the  restriction  of  certain  classes  of  patients  to  certain  scheduling  slots. 

Work on scheduling and sequencing patients in the clinic has not been
pursued further (although the hospital has expressed interest in the use of the patient
classification scheme above as a way of reducing service time variance, in a manner
similar to the scheme proposed by Davis and Reed in the context of operating room
scheduling [31]). The main reasons for reluctance, in the eyes of this researcher, are
personnel cuts in the scheduling area and a general suspicion of the practicality of
such nonintuitive results.

However, this preliminary study served its purpose well. It shows that the
potential of appointment schedule/sequence/combination optimization is substantial,
and it provides a model for future studies that can be employed or improved on.

Appendix F. Matching Moments with Coxian Distributions

This appendix examines approaches to obtaining parameters for a Coxian
distribution with a given set of moments. In the process, several apparently new results
are obtained:

1.  Necessary  and  sufficient  bounds  on  feasible  moments  of  Coxian-2  distributions, 

2.  A  convenient  framework  for  examining  moment  bounds, 

3.  An  analysis  of  the  feasible  3-moment  space  of  the  distribution  defined  by  a 
Coxian  phase  appended  to  an  Erlang  distribution, 

4.  A  recursive  approach  to  determining  moments,  and 

5.  A  graphical  approach  to  determining  equivalent  representations  of  a  phase-type
distribution.

It has been shown by others that phase-type distributions, in particular
Coxian distributions, can provide arbitrarily accurate representations of any desired
distribution with positive support (i.e., for which each negative value has zero
probability) [46, 73]. It remains to show methods of obtaining parameters (number of
stages, rates, and transition probabilities) of the phase-type approximation desired.
Usually, one is given empirical data representing the distribution. In this case, the
EMPHT programs developed by Haggstrom, Asmussen, and Nerman [58] or the EM
and ME programs developed by Asmussen, Nerman, and Olsson [5, 6] can provide the
required parameters for a Coxian approximation. Alternatively, tools such as
Johnson's MEFIT (Mixed Erlang Fit) can be used to fit a mixture of Erlang distributions,
which can then be transformed to a Coxian distribution, as will be shown [72, 77].
These and similar tools are surveyed by Lang [89, 90].

In some situations, raw empirical data are unavailable, and only several
moments are provided. This is often the case when one is trying to reproduce or expand
on others' earlier results. Moment matching is also a convenient tool for examining
the sensitivity of measures of merit to higher moments of the distribution.

Theoretically, moment matching may provide an accurate representation of
the distribution. Let $\phi_k$ be the $k$th noncentral moment. Cramér proved that if the
series $\sum_{k=0}^{\infty} \phi_k c^k / k!$ is absolutely convergent for some positive value of $c$, then the
distribution is uniquely determined by all of its moments [28]. Wilks proved that the
distribution is uniquely determined by its moments if the support (the set of points
for which the probability is nonzero) is bounded [170]. Practically, however, it is
impossible to determine whether Cramér's condition holds, and Wilks's condition
may not hold for service distributions of interest here. Whenever possible, one
should match the CDF of a distribution rather than its moments.

The major goal of this appendix is to examine possible approaches to
determining a phase-type distribution in which the first three moments match those of
a given distribution. The reason for this choice is concern over whether the third
moment of the service distribution strongly affects the optimal cost or schedule of
an appointment system. While typical measures of merit in steady-state
queueing systems are relatively insensitive to service distribution moments higher than
the second, some researchers have noted sensitivity to the third moment when the
coefficient of variation is greater than one [3, 169].

Once tools for matching moments of a given phase-type distribution are
developed, they can be used to identify situations in which an appointment system's
optimal cost and schedule may be sensitive to variations in the third moment. If
the third moment proves to be inconsequential, fewer phases should be necessary to
represent a given distribution, simplifying computation.

Tools  such  as  MEFIT  and  Schmickler’s  MEDA  (Mixed  Erlang  Distributions 
for  Approximation)  are  already  available  to  fit  mixtures  of  Erlang  distributions  to 
the  first  three  moments  [138].  However,  Johnson  and  Taaffe  prove  that  a  mixture  of 
Erlang  distributions  is  not  the  most  parsimonious  representation  of  a  given  moment 
set,  in  terms  of  number  of  phases  needed  [73].  The  parsimony  issue  is  examined  in
more  detail  below. 

F.1 Moment Space Coordinate System

Before examining the problem further, a convenient coordinate system will be
described for exploring the moment space, which consists of all possible combinations
of the first three moments. Unlike previous characterizations of this moment space,
the measures of second and third moments to be used here will be noncentral, as
well as scaled. The coefficient of variation is defined by $c = \sigma/\phi_1$, where $\sigma^2$ is the
variance. The quantity $\Phi_2 = \phi_2/\phi_1^2 = c^2 + 1$ will be used as a convenient measure
of the scaled second moment. Rather than skewness, $\Phi_3 = \phi_3/\phi_1^3$ will be used as a
measure of the third moment. These two measures will provide what will be shown to
be a natural coordinate system for the 3-moment space, in which many relationships
may be represented more simply.

Given a feasible combination of $\Phi_2$ and $\Phi_3$, any positive value of $\phi_1$ can always
be attained, simply by redefining the time variable to be the current time
multiplied by the ratio of the desired mean to the current mean. Since this
scaling does not change the values of the scaled noncentral moments, the question
of feasibility is one of characterizing the moment space in only two dimensions. In
particular, once the parameters are obtained to match a phase-type distribution to
the desired second and third scaled noncentral moments, the first moment can be
matched simply by dividing each phase rate by the above ratio.
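The coordinate change and the final rescaling step can be sketched in a few lines. This is a minimal illustration under the definitions above (scaled second moment = second moment over mean squared, scaled third moment = third moment over mean cubed); the function names are hypothetical and not part of the original work.

```python
def scaled_moments(phi1, phi2, phi3):
    """Return (Phi2, Phi3) = (phi2/phi1**2, phi3/phi1**3) from raw noncentral moments."""
    if phi1 <= 0:
        raise ValueError("the mean must be positive")
    return phi2 / phi1**2, phi3 / phi1**3


def rescale_rates(rates, current_mean, desired_mean):
    """Match the first moment by dividing every phase rate by desired_mean/current_mean.

    Stretching time by this ratio leaves Phi2 and Phi3 unchanged.
    """
    ratio = desired_mean / current_mean
    return [mu / ratio for mu in rates]


# Exponential distribution with rate 2: raw moments are k!/2**k.
Phi2, Phi3 = scaled_moments(0.5, 0.5, 0.75)
print(Phi2, Phi3)                        # 2.0 6.0
print(rescale_rates([2.0], 0.5, 1.0))    # [1.0]
```

Every exponential distribution maps to the same point (2, 6) in these coordinates, whatever its rate, which is the sense in which feasibility is a two-dimensional question.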

F.2 General Feasibility

What regions of this moment space are feasible for distributions with support
on $\Re^+$? First, it is trivially clear that the bounds $\phi_k > 0$ hold for $k \in \mathbb{Q}^+$. Now
consider the quantity

$$\phi_3 = E[t^3] = E[((t - \phi_1) + \phi_1)^3] = \phi_1^3 + E[(t - \phi_1)^3] + 3\phi_1\sigma^2 \tag{46}$$

The domain constraints $t > 0$ and $\phi_1 > 0$ imply $E[(t - \phi_1)^3] > -\phi_1^3$. Then $\phi_3 >
3\phi_1\sigma^2$, or $\Phi_3 > 3(\Phi_2 - 1)$, creating a necessary bound. Further, the skewness, defined
by $\gamma = E[(t - \phi_1)^3]/\sigma^3$, is usually known to be positive for distributions on $\Re^+$. If
this is the case, a tighter bound can be formed. Since $E[(t - \phi_1)^3] > 0$,
$E[t^3] > \phi_1^3 + 3\phi_1\sigma^2$, which is equivalent to $\Phi_3 > 3\Phi_2 - 2$. This bound is necessary
and sufficient for any distribution on $\Re^+$ with positive skewness. Here, a necessary
bound is one that must be met by a set of moments for feasibility. A bound is
considered sufficient if all points that meet that condition and, in addition, meet
other feasibility bounds, are feasible. A necessary and sufficient bound is one that
is tight, i.e., it delineates the set of feasible and infeasible points.

A necessary and sufficient limit for feasibility was obtained by Whitt [169]
using Tchebycheff systems theory: $\phi_1\phi_3 > \phi_2^2$. Johnson and Taaffe put this bound
in the equivalent form: $\gamma > c - c^{-1}$ [73, 76]. In the moment space representation
suggested above, it is equivalent to $\Phi_3 > \Phi_2^2$. This bound is always tighter than
$\Phi_3 > 3(\Phi_2 - 1)$, but the positive skewness bound, $\Phi_3 > 3\Phi_2 - 2$, is tighter when
$\Phi_2 < 2$ (i.e., $c < 1$). These bounds are shown in Figure 28.
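The general bounds of this section can be collected into a small feasibility check. The sketch below assumes the strict inequalities as written in the text (scaled third moment above the squared scaled second moment, and above three times the scaled second moment minus two when positive skewness is required); the function names and the dictionary keys are illustrative only.

```python
def general_lower_bounds(Phi2):
    """Lower bounds on Phi3 for a distribution on the positive reals, by name."""
    return {
        "domain": 3.0 * (Phi2 - 1.0),     # from t > 0 and a positive mean
        "skewness": 3.0 * Phi2 - 2.0,     # applies only if skewness is positive
        "tchebycheff": Phi2 ** 2,         # Whitt's bound, always above "domain"
    }


def is_generally_feasible(Phi2, Phi3, require_positive_skew=False):
    """True if (Phi2, Phi3) satisfies the general bounds quoted in Section F.2."""
    if Phi2 < 1.0:                        # Phi2 = c**2 + 1 cannot fall below 1
        return False
    bounds = general_lower_bounds(Phi2)
    feasible = Phi3 > bounds["tchebycheff"]
    if require_positive_skew:
        feasible = feasible and Phi3 > bounds["skewness"]
    return feasible


print(is_generally_feasible(2.0, 6.0, require_positive_skew=True))   # exponential point
print(is_generally_feasible(1.5, 2.4))                               # passes Whitt's bound
print(is_generally_feasible(1.5, 2.4, require_positive_skew=True))   # fails skewness bound
```

The last two calls illustrate the remark that the skewness bound is the tighter one when the scaled second moment is below 2.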

F.3  Obtaining  Coxian-r  Moments 

Given these conditions for feasibility for general distributions, it is natural to
consider similar bounds when the distribution is constrained to a particular form.
The ultimate goal is to represent a given moment set with the simplest Coxian
distribution possible. Here, the Coxian phase rates, $\mu_i$, are limited to $\Re^+$, transition
probabilities are required to be positive, and all probability mass is assumed to be
concentrated in the first stage at $t = 0$. While Cox [25] did not restrict his phase
representation in these ways (indeed, did not even define it in terms of the physical
representation in Figure 3), these restrictions are in common use today [3, 77, 117,
121].

In exploring the feasible moment space of Coxian distributions, it is necessary
to obtain moments efficiently. While standard approaches such as differentiation of
the Laplace transforms are available, the resulting expressions become unwieldy very
quickly. For example, moment expressions for a Coxian-3 are:

$\phi_1 = \ldots$

Clearly, it will be oppressive to obtain moments via Laplace transforms or
moment generating functions. One alternative is to use a matrix-geometric
representation [117]:

$$\phi_k = (-1)^k k!\, \boldsymbol{\alpha} T^{-k} \mathbf{e} \tag{50}$$

where $T$ is the state transition matrix, $\boldsymbol{\alpha}$ is the initial probability row vector
(here with all mass in the first stage), and $\mathbf{e}$ is a column vector of length $r$ with
all elements equal to unity, as discussed in Section 3.2. Another approach is to
generate the moments recursively from the Laplace transforms. Consider the recursive
representation of a Coxian-$j$ in Figure 29.


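Let $F_j(s)$ be the Laplace transform of a Coxian-$j$ distribution. Let $F_j^{(k)}(s)$ be
its $k$th derivative with respect to $s$. Then $\phi_{j,k} = (-1)^k F_j^{(k)}(0)$ is the $k$th moment
of a Coxian-$j$. Note that $F_1(s) = \mu/(s + \mu)$, $F_1^{(k)}(s) = (-1)^k k!\,\mu/(s + \mu)^{k+1}$, and
$\phi_{1,k} = k!/\mu^k$. Now suppose one wished to add a phase to the beginning of a
Coxian-$(j-1)$ to create a Coxian-$j$, with rate $\mu$ and probability $b$. Then the following holds
for $j > 1$:

$$
\begin{aligned}
F_j(s) &= (1 - b)F_1(s) + bF_1(s)F_{j-1}(s) \\
F_j^{(k)}(s) &= (1 - b)F_1^{(k)}(s) + b\sum_{i=0}^{k}\binom{k}{i}F_1^{(k-i)}(s)F_{j-1}^{(i)}(s) \\
F_j^{(k)}(0) &= F_1^{(k)}(0) + b\sum_{i=1}^{k}\binom{k}{i}F_1^{(k-i)}(0)F_{j-1}^{(i)}(0) \\
\phi_{j,k} &= \frac{k!}{\mu^k} + b\sum_{i=1}^{k}\frac{k!}{i!}\,\frac{\phi_{j-1,i}}{\mu^{k-i}}
\end{aligned}
\tag{51}
$$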


This  recursive  relation  provides  a  simple  alternative  to  matrix  formulations  when 
programming. 
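A minimal sketch of such a program follows. It assumes the moment form of the recursion obtained by applying the Leibniz rule to $F_j(s) = (1 - b)F_1(s) + bF_1(s)F_{j-1}(s)$ at $s = 0$, namely $\phi_{j,k} = k!/\mu^k + b\sum_{i=1}^{k}(k!/i!)\,\phi_{j-1,i}/\mu^{k-i}$, where $\mu$ and $b$ belong to the phase being prepended. The function name and argument layout are illustrative choices, not taken from the original code.

```python
from math import factorial


def coxian_moments(rates, probs, kmax=3):
    """First kmax noncentral moments of a Coxian distribution.

    rates: [mu_1, ..., mu_r], the phase rates in order of traversal.
    probs: [b_1, ..., b_{r-1}], where b_i is the probability of moving
           from phase i to phase i+1 (otherwise service ends after phase i).
    """
    if len(probs) != len(rates) - 1:
        raise ValueError("need exactly one transition probability per phase but the last")
    # Start with the last phase alone: exponential, phi_k = k!/mu**k (phi[0] = 1).
    mu = rates[-1]
    phi = [factorial(k) / mu**k for k in range(kmax + 1)]
    # Prepend the remaining phases one at a time, last to first.
    for mu, b in zip(reversed(rates[:-1]), reversed(probs)):
        new_phi = [1.0]
        for k in range(1, kmax + 1):
            tail = sum(
                factorial(k) / factorial(i) * phi[i] / mu ** (k - i)
                for i in range(1, k + 1)
            )
            new_phi.append(factorial(k) / mu**k + b * tail)
        phi = new_phi
    return phi[1:]


# Erlang-2 with unit rate is a Coxian-2 with b = 1.
print(coxian_moments([1.0, 1.0], [1.0]))
```

For the Erlang-2 example the moments are 2, 6, and 24, so the scaled moments are 1.5 and 3.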


F.4 Coxian-2 Feasibility Bounds when c > 1

Consider a Coxian-2 distribution. Defining $w = \mu_2/\mu_1$ and defining $b = b_1$
as the transition probability from the first to second phase, the recursive equation
above leads to

$$\phi_n = \frac{n!}{\mu_1^n}\left(1 + b\sum_{i=1}^{n} w^{-i}\right) \tag{52}$$

Algebraic manipulation of the first three moments leads to

$$\Phi_3 = \frac{3}{2}\Phi_2^2 + \frac{3w(\Phi_2 - 2)}{(w + b)^2} \tag{53}$$

It is clear that if $\Phi_2 > 2$ (i.e., if $c > 1$), a necessary condition for feasibility is
$\Phi_3 > \frac{3}{2}\Phi_2^2$. To see this bound is sufficient, consider the following expression for $\Phi_2$,
obtained from the moment expressions:

$$\Phi_2 = \frac{2(w^2 + wb + b)}{(w + b)^2} \tag{54}$$

In the limit as $w$ approaches zero, $\Phi_3$ approaches $\frac{3}{2}\Phi_2^2$ and $\Phi_2$ approaches $2/b$, which
by judicious choice of $b$ can take on any value greater than 2. The bound $\Phi_3 > \frac{3}{2}\Phi_2^2$
is therefore tight when $\Phi_2 > 2$. No other bounds pertain when $\Phi_2 > 2$. These results
were obtained by Altiok by a more involved argument [3]. This bound is depicted
in Figure 28.
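The relation between the scaled second and third moments of a Coxian-2 can be checked numerically. The sketch below assumes the closed forms that follow from the Coxian-2 moments with $w = \mu_2/\mu_1$: $\Phi_2 = 2(w^2 + wb + b)/(w + b)^2$ and $\Phi_3 = 6(w^3 + b(w^2 + w + 1))/(w + b)^3$. The second expression is derived here rather than quoted from the text, and the function names are illustrative.

```python
def coxian2_scaled_moments(w, b):
    """Scaled moments (Phi2, Phi3) of a Coxian-2 with rate ratio w = mu2/mu1."""
    denom = w + b
    Phi2 = 2.0 * (w * w + w * b + b) / denom**2
    Phi3 = 6.0 * (w**3 + b * (w * w + w + 1.0)) / denom**3
    return Phi2, Phi3


def residual_53(w, b):
    """Phi3 minus the right-hand side of Equation (53); should be zero."""
    Phi2, Phi3 = coxian2_scaled_moments(w, b)
    return Phi3 - (1.5 * Phi2**2 + 3.0 * w * (Phi2 - 2.0) / (w + b) ** 2)


for w, b in [(0.3, 0.4), (2.0, 0.9), (1.0, 1.0)]:
    print(w, b, residual_53(w, b))

# As w approaches zero, Phi2 approaches 2/b and Phi3 approaches 1.5 * Phi2**2.
print(coxian2_scaled_moments(1e-6, 0.5))
```

The residuals are zero up to rounding, and the small-$w$ case lands close to the point (4, 24) on the bounding curve.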


F.5  Equivalence  of  Phase-Type  Distributions 

Before proceeding with other bounds of the Coxian-2, Whitt's work on the
hyperexponential distribution with two phases ($H_2$) should be noted. A
hyperexponential distribution, $H_r$, is a mixture of $r$ exponential phases. A mixture is defined
by a set of distributions in parallel, with the distributions assigned mutually
exclusive and collectively exhaustive branching probabilities. Whitt found that the
feasible moment space for the $H_2$ is delineated by $\Phi_2 > 2$ and $\Phi_3 > \frac{3}{2}\Phi_2^2$, the same
bounds Altiok later obtained [169]. Johnson and Taaffe observed that the reason
for the equivalence of these bounds is that an $H_2$ is equivalent to a Coxian-2 with
$\mu_1 > \mu_2$ [73].

This equivalence is most easily seen by applying Cumani's results [29]. He
showed by a simple algebraic identity that any phase could be replaced by a mixture
of that phase and one with a larger transition rate, without changing the Laplace
transform of the phase-type distribution. Still letting $w = \mu_2/\mu_1$,

$$\frac{\mu_2}{s + \mu_2} = w\,\frac{\mu_1}{s + \mu_1} + (1 - w)\,\frac{\mu_1\mu_2}{(s + \mu_1)(s + \mu_2)} \tag{55}$$
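The identity in Equation (55) and its use in converting a two-phase hyperexponential into a Coxian-2 can be verified numerically. In the sketch below, the conversion rule $b = (1 - p)(1 - w)$ is an assumption obtained by substituting the identity into the hyperexponential transform $p\,\mu_1/(s + \mu_1) + (1 - p)\,\mu_2/(s + \mu_2)$; it is not quoted from the text, and the function names are illustrative.

```python
def exp_lt(mu, s):
    """Laplace transform of an exponential phase with rate mu."""
    return mu / (s + mu)


def cumani_rhs(mu1, mu2, s):
    """Right-hand side of Equation (55), with w = mu2/mu1."""
    w = mu2 / mu1
    return w * exp_lt(mu1, s) + (1.0 - w) * exp_lt(mu1, s) * exp_lt(mu2, s)


def h2_to_coxian2(p, mu1, mu2):
    """Map H2 = p*Exp(mu1) + (1-p)*Exp(mu2), mu1 > mu2, to Coxian-2 (mu1, mu2, b)."""
    if not mu1 > mu2 > 0:
        raise ValueError("requires mu1 > mu2 > 0")
    w = mu2 / mu1
    return mu1, mu2, (1.0 - p) * (1.0 - w)


def h2_lt(p, mu1, mu2, s):
    """Laplace transform of the two-phase hyperexponential."""
    return p * exp_lt(mu1, s) + (1.0 - p) * exp_lt(mu2, s)


def coxian2_lt(mu1, mu2, b, s):
    """Laplace transform of a Coxian-2 with continuation probability b."""
    return (1.0 - b) * exp_lt(mu1, s) + b * exp_lt(mu1, s) * exp_lt(mu2, s)


mu1, mu2, b = h2_to_coxian2(0.3, 4.0, 1.0)
for s in (0.0, 0.5, 3.0):
    print(s, exp_lt(1.0, s) - cumani_rhs(4.0, 1.0, s),
          h2_lt(0.3, 4.0, 1.0, s) - coxian2_lt(mu1, mu2, b, s))
```

Both differences vanish at every test point, which is what equivalence of the two representations requires.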


Figure 30 shows the transformation of an $H_2$ distribution into an equivalent
Coxian-2. This transformation is reversible as long as $\mu_1 > \mu_2$, which is tantamount to
requiring $c > 1$. This graphical approach to obtaining phase-type equivalence can be
rigorously supported and is more intuitive than manipulating Laplace transforms.

Figure 30. Transformation of H2 to Coxian-2 using Cumani's result [29]


F.6 Coxian-2 Feasibility Bounds when c < 1

If $\Phi_2 < 2$, then Equation (53) leads to the necessary bound $\Phi_3 < \frac{3}{2}\Phi_2^2$. This
is marked on Figure 31 as Altiok's bound, although Altiok was not concerned with
the case of $\Phi_2 < 2$.

By further manipulation of the moment expressions for a Coxian-2,

$$\Phi_3 = 6(\Phi_2 - 1) - \frac{3(1 - b)(2 - \Phi_2)}{w + b} \tag{56}$$

When $\Phi_2 < 2$, Equation (56) leads to the necessary upper bound $\Phi_3 \le 6(\Phi_2 - 1)$. To
see this bound is also sufficient, note that it is attained only when $\Phi_2 = 2$ or $b = 1$.
If $b$ is set to unity, $\Phi_2 = 2 - 2w/(w + 1)^2$, from which it is clear that $w$ can be chosen
to obtain any desired value of $\Phi_2 \in [1.5, 2.0]$. Thus the bound $\Phi_3 \le 6(\Phi_2 - 1)$ can be
attained and cannot be surpassed; it is tight. This bound is depicted in Figure 31
as the upper bound of the Coxian-2 feasible region.

One necessary lower bound to the Coxian-2 with $\Phi_2 < 2$ is provided by the
general feasibility condition, noted on Figure 31 as the Tchebycheff bound. It is also
referred to by Johnson and Taaffe as the Bernoulli bound, presumably since it can
only be attained by a generalized Bernoulli distribution [75]. This bound can be
tightened by further algebraic manipulation of the moment expressions to obtain

$$\gamma = \frac{2\left(w^3 + b(b^2 - 3b + 3)\right)}{c^3(w + b)^3} \tag{57}$$

Since the quadratic term is positive for all values of $b$, the skewness is always positive.
Another necessary lower bound is thus the positive skewness bound found earlier:
$\Phi_3 > 3\Phi_2 - 2$. This is marked in Figure 31 as the skewness bound.

A tight lower bound may be obtained by solving Equation (54) for $b$ and
substituting into Equation (56) to obtain

$$\Phi_3 = \frac{6\Phi_2^2\left(w^3 + (2 - \Phi_2)w^2 + (2 - \Phi_2)w + 1 + x(w^2 + w + 1)\right)}{(w + x + 1)^3} \tag{58}$$

where $x = \sqrt{(w + 1)^2 - 2w\Phi_2}$. For a fixed $\Phi_2$,

$$\frac{\partial \Phi_3}{\partial w} = \frac{6(w - 1)\Phi_2^2(2 - \Phi_2)\left(w^2 + 2w + 1 - \Phi_2 w + wx + x\right)}{x(w + x + 1)^4} \tag{59}$$

This partial derivative vanishes at $w = 1$. The last parenthetical term does not
vanish for $\Phi_2 \in [1.5, 2]$. Since

$$\left.\frac{\partial^2 \Phi_3}{\partial w^2}\right|_{w=1} = \frac{6(2 - \Phi_2)\Phi_2^2\left(\sqrt{2}\,(5 - 3(\Phi_2 - 1)) + \sqrt{2 - \Phi_2}\,(8 - \Phi_2)\right)}{\sqrt{2 - \Phi_2}\,\left(2 + \sqrt{2(2 - \Phi_2)}\right)^5} \tag{60}$$

is positive for $\Phi_2 \in [1.5, 2]$, $w = 1$ represents the minimum $\Phi_3$ for a fixed $\Phi_2$.
Substitution of $w = 1$ leads to the two equations

$$\Phi_3 = \frac{6(3b + 1)}{(b + 1)^3}, \qquad \Phi_2 = \frac{2(2b + 1)}{(b + 1)^2} \tag{61}$$

which combine to give a tight lower bound:

$$\Phi_3 = 9\Phi_2 - 12 + 3\sqrt{2}\,(2 - \Phi_2)^{3/2} \tag{62}$$

This bound is equally cumbersome when defined with respect to $c$:

$$\Phi_3 = 3\left(3c^2 - 1 + \sqrt{2(1 - c^2)^3}\right) \tag{63}$$

It is depicted in Figure 31 as the lower bound of the area marked "Coxian-2 feasible
region".

Aldous and Shepp proved that the least variability in a phase-type distribution
is attained by an Erlang distribution [2]. Therefore, the bound $\Phi_2 \ge 1.5$ ($c^2 \ge 0.5$)
also applies to the Coxian-2. This is marked in Figure 31 as the Erlang-2 bound. As
can be seen, it is rendered redundant by the combination of the above tight upper
and lower bounds, which meet at $\Phi_2 = 1.5$ and at $\Phi_2 = 2$ ($c^2 = 0.5$ and $c^2 = 1$).

To summarize the Coxian-2 moment space, if $c > 1$, the bound $\Phi_3 > \frac{3}{2}\Phi_2^2$
is necessary and sufficient. If $\Phi_2 < 2$, the two bounds $\Phi_3 \le 6(\Phi_2 - 1)$ and
$\Phi_3 \ge 3\left(3\Phi_2 - 4 + \sqrt{2(2 - \Phi_2)^3}\right)$ are necessary and sufficient.
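The summarized bounds translate directly into a membership test for the Coxian-2 feasible region. The sketch below treats the upper and lower bounds for scaled second moments between 1.5 and 2 as attainable (hence a closed interval with a small tolerance) and the bound for values above 2 as strict; that treatment of the boundary, along with the function names, is an assumption made for illustration.

```python
from math import inf, sqrt


def coxian2_phi3_range(Phi2):
    """Return (lower, upper) limits on Phi3 for a Coxian-2, or None if Phi2 < 1.5."""
    if Phi2 < 1.5:
        return None                       # below the Erlang-2 point
    if Phi2 > 2.0:
        return 1.5 * Phi2**2, inf         # lower limit is approached, not attained
    lower = 3.0 * (3.0 * Phi2 - 4.0 + sqrt(2.0 * (2.0 - Phi2) ** 3))
    upper = 6.0 * (Phi2 - 1.0)
    return lower, upper


def coxian2_feasible(Phi2, Phi3, tol=1e-12):
    """True if some Coxian-2 has scaled moments (Phi2, Phi3)."""
    limits = coxian2_phi3_range(Phi2)
    if limits is None:
        return False
    lower, upper = limits
    if Phi2 > 2.0:
        return Phi3 > lower
    return lower - tol <= Phi3 <= upper + tol


print(coxian2_phi3_range(1.5))        # the two limits coincide at the Erlang-2 point
print(coxian2_feasible(1.8, 4.7))
print(coxian2_feasible(3.0, 13.0))
```

At a scaled second moment of 1.5 both limits equal 3, the Erlang-2 value, and at 2 both equal 6, the exponential value.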

The author verified these bounds by means of nonlinear programs. A Newton
search using calculated second derivatives was applied, as was a conjugate gradient
search. The objective function was a weighted combination of the squared error in
$\Phi_2$ from the desired value and the value of $\Phi_3$. The weight on the error in $\Phi_2$ was set
very large, so that the search would stay very close to the desired $\Phi_2$. The weight
on $\Phi_3$ was set to ±1, depending on whether the minimum or maximum value was
desired. Several starting points were applied to each problem.

The  feasible  area  obtained  in  this  way  appears  to  be  precisely  that  depicted 
in  Figure  31.  Attempts  to  match  moment  pairs  lying  just  beyond  the  theoretical 
bounds  met  with  failure;  the  solutions  thus  obtained  lay  very  close  to,  but  within, 
the  bounds  in  every  trial.  This  supports  the  bounds  proven  above. 
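A cruder check than the nonlinear programs described above is a grid search over the rate ratio $w$. The sketch below is not the author's procedure: it assumes the Coxian-2 expressions $\Phi_2 = 2(w^2 + wb + b)/(w + b)^2$ and $\Phi_3 = 6(w^3 + b(w^2 + w + 1))/(w + b)^3$, solves the first for $b$ at each $w$, and records the extreme values of $\Phi_3$. The grid size and function name are arbitrary choices.

```python
from math import sqrt


def coxian2_phi3_extremes(Phi2, steps=20000, w_max=20.0):
    """Scan w = mu2/mu1 on a grid and return (min, max) of Phi3 at a fixed Phi2 < 2."""
    lowest, highest = float("inf"), float("-inf")
    for i in range(1, steps + 1):
        w = w_max * i / steps
        disc = (w + 1.0) ** 2 - 2.0 * w * Phi2
        if disc < 0.0:
            continue
        # Larger root of the quadratic in b; the other root is negative when Phi2 < 2.
        b = (w + 1.0 - Phi2 * w + sqrt(disc)) / Phi2
        if not 0.0 < b <= 1.0:
            continue                      # b must be a probability
        Phi3 = 6.0 * (w**3 + b * (w * w + w + 1.0)) / (w + b) ** 3
        lowest, highest = min(lowest, Phi3), max(highest, Phi3)
    return lowest, highest


Phi2 = 1.8
lowest, highest = coxian2_phi3_extremes(Phi2)
print(lowest, 9.0 * Phi2 - 12.0 + 3.0 * sqrt(2.0) * (2.0 - Phi2) ** 1.5)
print(highest, 6.0 * (Phi2 - 1.0))
```

On this grid the minimum occurs at $w = 1$ and matches the lower bound, while the maximum approaches the upper bound from below as the grid nears the points where $b = 1$.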

F.7  Matching  Two  Moments 

If only the first two moments are desired, the Coxian-2 is an adequate model
when $\Phi_2 \ge 1.5$. When $\Phi_2 = (r + 1)/r$, with $r$ integral, an Erlang-$r$ suffices. When
$(r + 1)/r < \Phi_2 < r/(r - 1)$, a mixture of an Erlang-$r$ and an Erlang-$(r - 1)$ distribution
suffices. In the pure Erlang case, $\Phi_3 = (2c^2 + 1)(c^2 + 1) = \Phi_2(2\Phi_2 - 1)$. In
the Erlang mixture case, the relation is more complicated, but it is approximated
quite accurately with the same expression. The resulting curve in 3-moment space
is depicted in Figure 28.


F.8  Matching  Three  Moments 

Using a Coxian-2 to match three moments is problematic: much of the
3-moment space is unobtainable via a Coxian-2, bounds had not been determined
until now for $\Phi_2 < 2$, and no convenient algorithm exists for obtaining the Coxian
parameters from the moments over much of the moment space. Hence, researchers
have turned to other phase-type distributions to match three moments. Johnson
and Taaffe proposed the use of mixtures of two Erlang-$r$ distributions with distinct
phase rates [73]. They showed that, for sufficiently large $r$, any set of feasible first
three moments is reachable (except for degenerate cases that are obtainable only
by concentrating the probability mass at one or two points). They also provided
analytical formulas for determining the parameters required for matching the first
three moments.

Johnson and Taaffe showed the number of phases must meet the conditions

$$
c > \frac{1}{\sqrt{r}}, \qquad
\gamma > \frac{1}{1+r}\left[c^{-3} + (1-r)\,c^{-1} + (2+r)\,c\right]
\tag{64}
$$


The expressions for these conditions are substantially simpler when expressed in the
($\phi_3$, $\phi_2$) coordinate system (Figure 32):

$$
\phi_2 > \frac{r+1}{r}, \qquad
\phi_3 > \frac{r+2}{r+1}\,\phi_2^{2}
\tag{65}
$$
The minimal value of r necessary to meet these conditions is

$$
r_{\min} = \operatorname{int}\left[\max\left\{1,\ \frac{1}{\phi_2 - 1},\
\frac{2\phi_2^{2} - \phi_3}{\phi_3 - \phi_2^{2}}\right\}\right]
\tag{66}
$$
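
As a hedged illustration of how the smallest admissible order might be found in practice, the sketch below tests the two conditions directly on raw moments rather than relying on a rounding convention. It assumes the conditions can be written as $m_2/m_1^2 > (r+1)/r$ and $m_1 m_3/m_2^2 > (r+2)/(r+1)$, which is the form the $(c, \gamma)$ conditions take when expanded in raw moments; the function name and the `max_r` cap are hypothetical.

```python
def min_erlang_order(m1, m2, m3, max_r=10_000):
    """Smallest integer r >= 1 with m2/m1^2 > (r+1)/r and m1*m3/m2^2 > (r+2)/(r+1)."""
    n2 = m2 / m1 ** 2
    ratio = m1 * m3 / m2 ** 2
    if n2 <= 1.0 or ratio <= 1.0:
        # Both right-hand sides decrease to one, so no finite r can work here.
        raise ValueError("moments lie outside the region reachable for any finite r")
    for r in range(1, max_r + 1):
        if n2 > (r + 1) / r and ratio > (r + 2) / (r + 1):
            return r
    raise ValueError("no admissible r found up to max_r")
```

For the moments of an exponential distribution (1, 2, 6) the strict inequalities fail at r = 1 and the search returns r = 2.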


Figure 32. Feasible 3-moment space for a mixture of two Erlang-r distributions
[73]. It will be shown in Section F.9 that this is also the sufficient
space of a Cox-plus-Erlang-r distribution.

A mixture of an Erlang-r($\mu_\alpha$) and an Erlang-r($\mu_\beta$), $\mu_\alpha > \mu_\beta$, is equivalent to at
least one Coxian-2r [29]. Further, there is only one Coxian-2r with
$\mu_1 \ge \mu_2 \ge \cdots \ge \mu_{2r}$ that represents the distribution [121]. (Cumani's result shows
directly that there are an infinite number of Coxian representations if the phase rates are not required
to  be  ordered.)  This  unique  distribution  is  depicted  in  Figure  33.  It  is  a  Coxian- 
2r  with  parameters  obtained  by  Cumani’s  relation  as  Hi  =  H2  =  •  •  •  =  Hr  =  Ha, 
Hr+i  =  Hr +2  =  .  ■  ■  =  H 2t  =  Hp,  —  b2  =  —  br— i  1,  br  1  p  iv  ,  and 

bj  =  1  -  (1  -  p)wj~r(  1  -  w)2r~j  for  j  =  r  +  l,r  +  2, . . .  2 r.  Again,  w  =  Hi/Hi 

and  Hi  >  H2-  Thus,  the  approach  of  Johnson  and  Taaffe  leads  to  an  analytical 
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procedure  to  obtain  the  parameters  of  a  Coxian-2r  distribution  whose  first  three 


moments  match  any  feasible  set. 


The  mixture  of  two  Erlang-r’s  in  Figure  33  has  precisely  the  same  Laplace 
transform  as  its  Coxian-2r  counterpart,  so  all  moments  of  the  two  match.  If  all 
that  is  required  is  matching  the  first  three  moments,  substantially  fewer  phases  are 
needed.  Johnson  and  Taaffe  proved  that  the  feasible  3-moment  space  of  a  mixture 
of  two  Erlang-r’s  is  precisely  the  same  as  the  feasible  3-moment  space  of  a  mixture 
of an Erlang-r and an exponential [77]. This in turn is precisely equivalent to either
the Coxian-(r + 1) in Figure 34 or that in Figure 35, depending on the ratio of the
phase rates. Note that $w = \mu_\alpha/\mu_\beta$ in the first case and $w = \mu_\beta/\mu_\alpha$ in the second, so
that it is always true that $w \in [0, 1]$.

For the case in which $\mu_\alpha > \mu_\beta$, the $b_j$ are easily computed and are shown in
the figure. For the case in which $\mu_\alpha < \mu_\beta$, it can be proved recursively that

b\  =  pw 

h-  =  P(!  -  +  1  ~  P_  for  2  <  j  <  r  —  3 

3  p(  1  —  w)J_1  +  1  —  p 
p(  i  -  wy  + 1  —  p

br-2  =  1  -  P  +  p(l  _  w)3~  1  +  1  -  p 

p(l  -  w)r 

&r_1  _  l-p  +  p(l-w)r^ 


not, however, provide a method of obtaining the coefficients for the Coxian-(r + 1).
If analytical determination is required, it is necessary to resort to the Coxian-2r.
This author's attempts to solve the set of algebraic equations resulting from
matching moments with the Coxian-(r + 1) have failed. If analytical expressions
are not required, a nonlinear program (NLP) can be used to invert the moment
expressions and obtain the coefficients.

F.9  Matching  Three  Moments  with  a  Cox-Plus-Erlang-r  Distribution

For any NLP, efficiency will degrade rapidly as the number of estimated parameters
increases, so it is essential to consider ways of representing the 3-moment space
that reduce the number of parameters to be estimated. In addition, if a phase-type
representation is desired, it is usually desirable to minimize the number of phases
required. The most efficient choices discussed above are twofold. First, one can
represent the moments with a mixture of an Erlang-r and an exponential distribution,
then transform it to a Coxian-(r + 1) distribution, in which case two parameters
must be estimated by means of an NLP. Second, one can represent the moments
with a mixture of two Erlang-r distributions, in which case two parameters may be
analytically determined. This section proposes a more efficient model, if c > 1.

Since only two of the 2r parameters of the Coxian-(r + 1) in Figures 34 and 35
are independent, it might seem that one could match three moments with fewer than
r + 1 phases, but experiments attempting to do so have not been successful. An
NLP  was  constructed  to  determine  the  feasible  3-moment  space  of  Coxian-(r  +  1) 
distributions.  For  a  variety  of  values  of  r,  points  very  close  to  the  conjectured 
boundaries  were  observed,  but  no  points  violating  those  boundaries  were  obtained. 
This  suggests  that  these  conjectured  bounds  are  tight  and  that  the  full  r  +  1  phases 
are  necessary  for  3-moment  matching.  Similar  experiments  have  been  performed  by 
Johnson  and  Taaffe  [75]. 

In  their  NLP  model,  Johnson  and  Taaffe  allowed  all  parameters  of  the  Coxian- 
(r  +  1)  to  vary  [75].  They  found  that,  for  points  close  to  the  boundary,  their  NLP 
solution  strongly  approximated  two  Coxian  stages  followed  by  an  Erlang-(r  —  1). 
(This  is  in  contradistinction  to  the  four-parameter  Coxian  distributions  in  Figures  33 
and  34,  which  are  proved  to  be  capable  of  matching  three  moments.)  Johnson’s  and 
Taaffe’s  result  has  been  reproduced  in  this  effort.  It  may  be  of  some  use  when 
moment  points  are  close  to  the  boundary,  since  the  surface  formed  when  restricting 
the  parameters  to  the  form  in  Figures  34  and  35  creates  local  minima  that  impede 
progress  of  an  NLP.  The  Cox-Cox-Erlang  form  does  not  exhibit  local  minima  close 
to  the  boundary,  which  may  be  the  reason  such  a  solution  is  so  often  obtained  when 
searching  for  a  moment  set  in  this  region  and  allowing  all  parameters  to  vary. 
Johnson's and Taaffe's observation does raise the question of feasible space
when the model in Figure 34 is further constrained. For instance, empirical study
led me to consider a single Coxian stage followed by an Erlang-r as an effective
model when $\phi_2 > 2$. This is a natural generalization of the Coxian-2, as suggested
by the empirically determined bounds in Figure 36. The moment spaces of other
Cox-plus-Erlang distributions exhibit the same general shape. This suggests the
following theorem:


Theorem 20 The Cox-plus-Erlang-r distribution can reach all 3-moment points for
which $\phi_2 > 2$ and $\phi_3 > \frac{r+2}{r+1}\,\phi_2^{2}$.

Proof: The moment expressions for the Cox-plus-Erlang-r distribution are

$$
\phi_2 = \frac{2w^2 + 2brw + br^2 + br}{(w + br)^2}
\tag{68}
$$

$$
\phi_3 = \frac{6w^3 + 6brw^2 + 3br^2w + 3brw + br^3 + 3br^2 + 2br}{(w + br)^3}
\tag{69}
$$


As $w \to \infty$, $\phi_2 \to 2$ in a continuous fashion. As $w \to 0$ and $b \to 0$, $\phi_2 \to \infty$
continuously, so all values of $\phi_2$ greater than 2 can be reached. Suppose $\phi_2$ is fixed
at one of these values and that r is also fixed. What are the possible values $\phi_3$ may
take on?

Solving Equation (68) for b and substitution into Equation (69) yields an
expression for $\phi_3$ in terms of w, r, and $\phi_2$:

$$
\xi = 2w + r + 1 \pm \sqrt{(2w + r + 1)^2 - 4w\phi_2(r + 1)}, \qquad
\phi_3 = \frac{4\phi_2^{2}}{\xi^{3}}\left[(r + 1)(r + 3w + 2)(\xi - 2w\phi_2) + 6w^{2}\xi\right]
\tag{70}
$$


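Allowing $w \to 0$ yields

$$
\lim_{w \to 0} \xi = (r + 1) \pm (r + 1), \qquad
\lim_{w \to 0} \phi_3 = \frac{4\phi_2^{2}\,(r + 1)(r + 2)}{\xi^{2}}
\tag{71}
$$

and substitution of the two values of $\xi$ yields

$$
\lim_{w \to 0} \phi_3 =
\begin{cases}
\infty, & \text{negative branch } (\xi \to 0) \\[4pt]
\dfrac{r+2}{r+1}\,\phi_2^{2}, & \text{positive branch } (\xi \to 2(r+1))
\end{cases}
\tag{72}
$$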


Thus, as $w \to 0$, it is shown that both the desired lower bound of $\phi_3$ is approached
and also that $\phi_3$ is unbounded above, depending on the branch of $\xi$ taken. If $\phi_3$ is
continuous with respect to w for a fixed $\phi_2$, then all points between these bounds can
be reached as well. The only threat to this continuity lies in the possible nonexistence
of $\xi$. The situation may be seen in Figure 37. When $w < (r + 1)/(4\phi_2(r + 1) - 2) = \eta$,
$\xi$ is bifurcated, and each branch is continuous with respect to w over $w < \eta$. Further,
Figure 37. Constant-w contours for a Cox-plus-Erlang-2 as b is varied. All points
for which c > 1 and that are reachable with a Cox-(r + 1) distribution
are also reachable with a Cox-plus-Erlang-r distribution.

they both converge to the same value at $w = \eta$, so $\xi$ is continuous with respect to
w over $w \le \eta$. Then $\phi_3$ is continuous over $w \le \eta$ as well, and by the intermediate
value theorem, it can attain any value in $\left(\frac{r+2}{r+1}\phi_2^{2}, \infty\right)$, using some $w \in [0, \eta]$.

In the special case of w = 1, the Cox-plus-Erlang-r is useful in matching just
the first two moments when $1/(r-1) < c^2 < 1$. The distribution is then equivalent
to a mixture of an exponential and an Erlang-(r − 1), which Tijms calls an $E_{1,r-1}$
distribution [154]. Closed expressions for the parameters as a function of the desired
moments are easy to obtain in this case.

Theorem 20 shows that, when c > 1, the Cox-plus-Erlang-r distribution can
reach any set of three moments that is (conjectured to be) reachable by a Coxian-(r + 1)
distribution. Use of this restricted Coxian-(r + 1) provides a fast and parsimonious
approach to matching any feasible three moment set when c > 1. Rather than a
search over the 2r − 1 variables required in an unconstrained Coxian-r distribution,
the minimum value of r is selected analytically, using Equation (66). When r = 1, a
Coxian-2 is required, and parameters can be determined analytically using Altiok's
approach [3]. For r > 1, a Cox-plus-Erlang-r is required. A line search over w
is performed, employing Equations (68), (69), and (71), after which the other
parameters can be determined analytically. A Microsoft Excel™ spreadsheet is
provided in Appendix H that accepts a desired moment set and provides Coxian
parameters when c > 1.
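
The line search described above might look like the following sketch. It is not the spreadsheet of Appendix H. It assumes that time is scaled so each Erlang phase has unit mean, that w is the mean of the leading Coxian stage on that scale, that the second and third coordinates are $m_2/m_1^2$ and $m_3/m_1^3$ (as the denominators of the moment expressions suggest), and that a plain bisection is an acceptable stand-in for the line search. The bracket end `eta` is taken as the value of w at which the square root in the solution for b vanishes. All function names are hypothetical.

```python
import math

def cox_erlang_moments(w, b, r):
    """Normalized moments (m2/m1^2, m3/m1^3) of one exponential stage with mean w
    followed, with probability b, by r exponential stages of unit mean."""
    x = b * r
    m1 = w + x
    m2 = 2 * w * w + 2 * x * w + x * (r + 1)
    m3 = 6 * w ** 3 + 6 * x * w * w + 3 * x * (r + 1) * w + x * (r + 1) * (r + 2)
    return m2 / m1 ** 2, m3 / m1 ** 3

def solve_b(w, phi2, r, sign):
    """Solve the second-moment expression for b on one branch (sign = +1 or -1)."""
    disc = (2 * w + r + 1) ** 2 - 4 * w * phi2 * (r + 1)
    xi = 2 * w + r + 1 + sign * math.sqrt(max(disc, 0.0))
    return (xi - 2 * w * phi2) / (2 * phi2 * r)

def match_three(phi2, phi3, r, tol=1e-12):
    """Bisection over w for (w, b) reproducing phi2 > 2 and phi3 above the lower bound."""
    if phi2 <= 2 or phi3 <= (r + 2) / (r + 1) * phi2 ** 2:
        raise ValueError("target outside the region covered by this sketch")
    # Largest w for which the square root is real (discriminant equals zero).
    eta = (r + 1) * (phi2 - 1 - math.sqrt(phi2 * (phi2 - 2))) / 2

    def gap(w, sign):
        return cox_erlang_moments(w, solve_b(w, phi2, r, sign), r)[1] - phi3

    sign = 1 if gap(eta, 1) >= 0 else -1
    lo, hi = 1e-6 * eta, eta
    for _ in range(10):            # shrink lo until the gap changes sign on [lo, hi]
        if sign * gap(lo, sign) < 0:
            break
        lo /= 10
    else:
        raise ValueError("could not bracket the target")
    f_lo = gap(lo, sign)
    while hi - lo > tol * eta:
        mid = 0.5 * (lo + hi)
        f_mid = gap(mid, sign)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    w = 0.5 * (lo + hi)
    return w, solve_b(w, phi2, r, sign)
```

Because b is recovered exactly from the second-moment expression at every trial w, only the third moment has to be searched for, which is what keeps the search one-dimensional.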

F.10  Conclusions 

To summarize results in this section, Johnson and Taaffe showed that it is
possible to match three moments with a mixture of two Erlang-r distributions if
r is large enough to make the inequalities in Equation (66) true. Further, they
provided a method to obtain the coefficients of that distribution. This leads by
Cumani's result to analytical expressions for the parameters of a Coxian-2r. Johnson
and Taaffe further proved that r + 1 Coxian phases were sufficient to match the first
three moments, but analytical expressions for the parameters have not been obtained,
nor has the use of r + 1 phases been proved to be necessary. The parameters required
can be obtained easily using an NLP, however.

The  bounds  on  the  3-moment  space  of  a  Coxian-2  were  established  for  c  <  1. 
The  Coxian-2  may  be  generalized  to  the  Cox-plus-Erlang-r  distribution,  and  the 
moment  space  of  each  member  of  the  family  has  similar  properties,  the  most  salient 
of which is that the feasible region when c < 1 is too small to be of use in
moment-matching applications.

However,  when  c  >  1,  the  Cox-plus-Erlang-r  distribution  is  highly  useful  for 
moment-matching.  Tight  bounds  for  the  distribution  were  established,  and  these 
are precisely the conjectured bounds for a Coxian-(r + 1) distribution when c > 1.
Thus, the family of Cox-plus-Erlang-r distributions covers the feasible 3-moment
space for c > 1, providing a fast and easy approach to 3-moment matching for
Coxian distributions.

If c < 1, three moments are often not required [3]. If they are, a reasonable
approach to obtaining Coxian parameters is to match a mixture of Erlang-r and
exponential distributions using Johnson's and Taaffe's empirical approach [77], then
to transform the result into a Coxian-(r + 1) distribution by means of Cumani's
result [29].

Three  other  tools  were  developed.  A  new  coordinate  system  was  proposed 
to  examine  the  moment  space,  one  in  which  bounds  for  Coxian  distributions  are 
straight  lines.  A  recursive  expression  was  devised  to  calculate  moments  of  Coxian 
distributions  efficiently.  A  graphical  approach  to  transforming  phase  distributions 
via Cumani's result was employed. Each of these tools was of use in this research
and may be of use to others.
Appendix  G.  Calculating  the  Exponential  of  a  Matrix

Section 3.3 developed an algorithm for the evaluation of the cost of an appointment
system using phase-type approximations of service distributions. This algorithm
depends on the evaluation of $\exp[Q(\tau_{j+1} - \tau_j)]$, where Q is an (N + 1) × (N + 1) matrix.
While  a  number  of  approaches  to  this  calculation  have  been  advanced  in  the  past, 
some  have  poor  accuracy  or  are  inefficient  for  matrices  with  particular  features  [108]. 
The  goal  of  this  appendix  is  to  ascertain  a  method  of  evaluation  appropriate  for  the 
cost  evaluation  algorithm.  In  passing,  several  problems  encountered  in  commercial 
software  packages  will  be  examined  as  well. 

Several  features  of  the  problem  at  hand  are  pertinent. 

-  As discussed in Section 3.3, it is desired to calculate an exponential only once for
each appointment system, since this is the most computationally intense part
of the cost algorithm. The exponential must be determined for each arrival,
but if each $\tau_{j+1} - \tau_j$ is a multiple of some $\Delta$, then $\exp(Q(\tau_{j+1} - \tau_j))$ is an integral
power of $\exp(Q\Delta)$, making it simpler to find each exponential once $\exp(Q\Delta)$ is
determined. When arrival times are lattice, $\Delta$, the smallest step size, is given.
In the continuous arrival time case, the algorithm in Section 4.4 discretizes the
problem with increasingly small values of $\Delta$, and the same approach holds.
For problems with nonlattice arrival times, a lattice approximation could be
imposed, making $\Delta$ the greatest (nearly) common integer divisor of $\tau_2 - \tau_1$,
$\tau_3 - \tau_1$, ..., $\tau_{n+1} - \tau_1$. Thus, a desirable feature of the calculation of $\exp(Qk\Delta)$ is
that the approach allow recalculation with different k with minimal additional
computation. At worst, one may calculate $\exp(Q\Delta)$, then take the resulting
matrix to the kth power; some methods (e.g., Cayley-Hamilton) may involve
less computation than this.

-  Since Q is a probability matrix, one may first define P as the transition matrix
obtained by eliminating the exit state from the system, then calculate the N × N
matrix $\exp(P\Delta)$. The matrix $\exp(Q\Delta)$ can then be formed by appending a row
of N zeros to the bottom of $\exp(P\Delta)$, then calculating the (N + 1)st column
so that the sum of each row is unity. This may improve the efficiency of some
methods slightly, since they are operating on a smaller matrix.

-  Q is triangular. For triangular matrices, the eigenvalues, required for some
exponentiation methods, are equal to the diagonal elements. This immediate
access to the precise eigenvalues may lend some advantage to methods that
require them. The further observation that P often is sparse over entries far
from the diagonal does not seem to lend any computational advantages here.

-  The function exp(Q) is well-defined. If a function of a complex scalar has a
Maclaurin expansion that converges for all arguments in an open ball of some
radius about the origin, the series also converges for any argument that is a
square matrix and for which the largest eigenvalue lies in the same ball [16].
Such a function is called “well-defined” within that ball. It can be shown
that, since Q is upper triangular, any well-defined function of Q is also upper
triangular, simplifying calculations [124].

-  The application of some of the moment-matching approaches in Appendix F
can result in phase rates that are widely disparate. This is particularly true
of the Coxian-(r + 1) distribution formed by appending a Coxian phase to an
Erlang-r. Since these phase rates are the negatives of the eigenvalues (with
multiplicity), an exponentiation approach is required that can handle widely
varying eigenvalues.

-  Due either to use of moment-matching approaches or to a number of identical
customers, it may be expected that a number of phases are identical. Many
exponentiation methods encounter numerical instabilities when eigenvalues are
confluent (identical) or nearly so.

-  The problem size of interest will be limited arbitrarily to fewer than 30 customers,
each represented by a Coxian distribution with 3 or fewer phases, so
the algorithm should be capable of exponentiating matrices with N < 100.
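
Two of the features listed above lend themselves to a short illustration: forming $\exp(Q\Delta)$ from $\exp(P\Delta)$ by appending the exit state, and reusing $\exp(Q\Delta)$ for an interval of $k\Delta$ by taking a matrix power. The sketch below is a minimal pure-Python version with hypothetical function names; it assumes the exit state is the last row and column and uses a one-phase example with rate 2 and $\Delta = 0.5$.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, k):
    """a**k for integer k >= 0 by repeated squaring."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    while k > 0:
        if k & 1:
            result = mat_mul(result, a)
        a = mat_mul(a, a)
        k >>= 1
    return result

def append_exit_state(exp_p):
    """Build exp(Q*Delta) from exp(P*Delta): add an exit column that makes each
    row sum to one, then an absorbing bottom row (N zeros followed by one)."""
    n = len(exp_p)
    rows = [row + [1.0 - sum(row)] for row in exp_p]
    rows.append([0.0] * n + [1.0])
    return rows

# One phase with rate 2 and Delta = 0.5, so exp(P*Delta) = [[exp(-1)]].
step = append_exit_state([[math.exp(-1.0)]])
three_steps = mat_pow(step, 3)   # corresponds to an interval of 3*Delta
```

Repeated squaring needs on the order of $\log_2 k$ matrix products, which is the "at worst" cost mentioned in the first item.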

With these features and caveats in mind, a number of approaches were considered.
Three were coded in FORTRAN-90, and some use IMSL™ routines. The
listings appear in Appendix H. Three routines provided in the MATLAB™ software
package were considered as well. Several other methods were examined theoretically,
in light of the success or failure of the application of those tested.


G.1  Maclaurin  Series  Truncation

The Maclaurin series approach is based on truncation of

$$
\exp(P\Delta) = \sum_{j=0}^{\infty} \frac{(P\Delta)^{j}}{j!}
\tag{73}
$$

which is the definition of the exponential of a matrix. $(P\Delta)^0$ is defined to be the
identity matrix of appropriate size (here, N × N). This series converges for all
arguments.

Approaches to matrix exponentiation based on truncation of the Maclaurin
series have a long history. Error analyses, critical for effective truncation, were
provided by Standish [151] and by Liou [100]. Let $T_k(P\Delta)$ be the series of the first
k terms. Liou showed that

$$
\|\exp(P\Delta) - T_k(P\Delta)\|_2 \le
\frac{\|P\Delta\|_2^{\,k+1}}{(k+1)!\,\left[1 - \|P\Delta\|_2/(k+2)\right]}
\tag{74}
$$


Here, $\|A\|_2$ is the 2-norm, which is defined as follows [54]:

$$
\|A\|_2 = \max_{x \ne 0} \frac{\|Ax\|_2}{\|x\|_2}
\tag{75}
$$

where x is an N-vector. This 2-norm is calculation-intensive, and it may be worth
the loss in bounding effectiveness to replace it with the Frobenius norm when N is
small [17]:

$$
\|A\|_2 \le \|A\|_F = \sqrt{\sum_{i=1}^{N} \sum_{j=1}^{N} |a_{ij}|^{2}}
\tag{76}
$$


There apparently has been no discussion in the literature about problems implementing
this error bound. Specifically, when k approaches $\|P\Delta\| - 2$, the denominator of
Equation (74) approaches zero, causing the algorithm to stop, no matter how large
the error term. On the other hand, as long as $\|P\Delta\|$ is not very close to any integer
greater than one, this situation never occurs. An easy way to avoid it is to require
$\|P\Delta\| < 2$. More will be said on the issue of the magnitude of $\|P\Delta\|$ shortly.

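Standish showed (in a larger context) that, if P is a transition matrix,

$$
\|\exp(P\Delta) - T_k(P\Delta)\|_T \le \frac{N^{k-1}\left(\|P\Delta\|_T\right)^{k}}{k!}
\tag{77}
$$

where $\|P\Delta\|_T$ is the Tchebycheff norm of $P\Delta$ in $R^{N^2}$, defined by

$$
\|P\Delta\|_T = \max\left(\left|[P\Delta]_{ij}\right|\right) \quad \text{over } i, j = 1, 2, \ldots, N
\tag{78}
$$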


Golub and Van Loan point out that $\|P\Delta\|_T \le \|P\Delta\|_2 \le N\|P\Delta\|_T$, leading to
the possibility of roughly comparing the error bounds of Standish and Liou. Suppose
$\|P\Delta\|_T = \|P\Delta\|_2$, and assume for now that $\|P\Delta\|_2 < 1$. If k is fixed, two
monotonically increasing error bounds result. An example is seen in Figure 38 for the case
of k = 2. Liou's bound on the error is smaller (and hence tighter) for all
but very small values of $\|P\Delta\|_2$. For larger values of k, the crossover point where
both bounds are equal increases, but does not exceed $\|P\Delta\|_2 = 0.2$ when k < 100.
Figure 39 shows a log-log plot of the error bounds for $\|P\Delta\|_2 = 0.1$. While it shows
Standish's bound dominating for much of the range of k, the amount of dominance is
masked by the log scale. Even for those values of $\|P\Delta\|_2$ and k for which Standish's
bound is superior, it is not substantially so, if $\|P\Delta\|_T = \|P\Delta\|_2$.


Figure 38. Error bounds versus the norm for a Maclaurin series approximation
when $\|P\Delta\|_T = \|P\Delta\|_2 = 0.1$.

Requiring $\|P\Delta\|_T = \|P\Delta\|_2$ puts the ratio of Standish's bound to Liou's bound
at its highest value over all possible N × N matrices. At its minimum possible value,
the ratio is a factor of N smaller. For sufficiently large N and the right P, then,
Standish's bound could be substantially superior to Liou's, regardless of the size of
k and $\|P\Delta\|_T$. The analysis shows that neither bound dominates in all cases, so a
stopping criterion for the Maclaurin exponentiation program in Section H.1 is formed
from the combination of the two.

Figure 39. Error bounds versus k for the Maclaurin series approximation when
$\|P\Delta\|_T = \|P\Delta\|_2 = 0.1$.

Moler and Van Loan pointed out that algorithms based on series truncation
are typically ineffective for large values of $\|P\Delta\|$ [108], true regardless of how the
norm is computed. If $\|P\Delta\| > 1$, there is a “hump” in the graph of $\|\exp(P\Delta)\|$
vs $\|P\Delta\|$ that increases the number of terms needed for a given accuracy. They
recommended adding a scale-and-square algorithm to such approaches. This is based
on the identity $\exp(A) = (\exp(A/y))^{y}$. Choose m such that $\|P\Delta/2^{m}\| < 1$, then
find $\exp(P\Delta/2^{m})$ by some method. Finally, recursively square the result m times
to obtain $\exp(P\Delta)$. While formal error analysis of this scale-and-square routine
has not been accomplished, it appears that little precision is lost from scaling even
by $2^{16}$. Moler and Van Loan reported that this scaling and squaring, combined
with Maclaurin series truncation, is one of the most effective methods of matrix
exponentiation known [108].
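
The combination of series truncation with scaling and squaring can be sketched as follows. This is not the program of Section H.1. It assumes the truncated series contains the terms $j = 0, \ldots, k$, uses the Frobenius norm as an upper bound on the 2-norm, and stops on Liou's bound alone; the tolerance applies to the scaled matrix before squaring, and the function names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(a):
    """Frobenius norm, an upper bound on the 2-norm."""
    return math.sqrt(sum(x * x for row in a for x in row))

def liou_bound(norm, k):
    """Liou's bound on the error after summing terms j = 0..k; needs k + 2 > norm."""
    return norm ** (k + 1) / (math.factorial(k + 1) * (1.0 - norm / (k + 2)))

def expm_taylor(a, tol=1e-14):
    """exp(a) by a truncated Maclaurin series with scaling and squaring."""
    n = len(a)
    norm = frobenius(a)
    m = 0
    while norm / 2 ** m >= 1.0:          # scale until the norm is below one
        m += 1
    scaled = [[x / 2 ** m for x in row] for row in a]
    s_norm = norm / 2 ** m
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    k = 0
    while liou_bound(s_norm, k) > tol:   # add terms until the bound is met
        k += 1
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(m):                   # undo the scaling by repeated squaring
        total = mat_mul(total, total)
    return total
```

Scaling to a norm below one also keeps the denominator of the bound away from zero, which sidesteps the implementation problem noted earlier in this section.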

G.2  Pade  Approach 

This approach is based on an extension of Maclaurin series representation
from polynomial expressions to ratios of polynomial expressions. The (p, q) Pade
approximation to f(x) is the ratio $r(x) = N_{pq}/D_{pq}$ for which r(x) and its first p + q
derivatives at the origin are equal to the corresponding values of f(x), where p and q
are the degrees of $N_{pq}$ and $D_{pq}$, respectively. For exponentiation, it can be shown
that [98]

$$
N_{pq}(Q) = \sum_{j=0}^{p} \frac{(p+q-j)!\,p!}{(p+q)!\,j!\,(p-j)!}\,Q^{j}, \qquad
D_{pq}(Q) = \sum_{j=0}^{q} \frac{(p+q-j)!\,q!}{(p+q)!\,j!\,(q-j)!}\,(-Q)^{j}
\tag{79}
$$


The most common Pade approach to exponentiation is to set p = q (diagonal
approximation), first applying a scale-and-square procedure similar to that above, and
for the same reasons. The restriction to diagonal approximations is particularly
important when Q has widely divergent eigenvalues [108]. An algorithm is provided
in Golub and Van Loan [54], and a corresponding program is provided in the
MATLAB™ software package. This program is employed in the analysis below.
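
To make the coefficient pattern in the Pade expressions concrete, the sketch below evaluates the diagonal approximant for a scalar argument. It is an illustration only and is not the MATLAB routine referred to above. It assumes the denominator is a polynomial in the negated argument; for a matrix argument the final division would become the solution of a linear system. The function names are hypothetical.

```python
import math

def pade_coeffs(p, q):
    """Coefficients of x**j in the numerator and of (-x)**j in the denominator."""
    f = math.factorial
    num = [f(p + q - j) * f(p) / (f(p + q) * f(j) * f(p - j)) for j in range(p + 1)]
    den = [f(p + q - j) * f(q) / (f(p + q) * f(j) * f(q - j)) for j in range(q + 1)]
    return num, den

def pade_exp_scalar(x, p=6, q=6):
    """(p, q) Pade approximation to exp(x) for a scalar x."""
    num, den = pade_coeffs(p, q)
    n = sum(c * x ** j for j, c in enumerate(num))
    d = sum(c * (-x) ** j for j, c in enumerate(den))
    return n / d
```

With p = q = 1 the coefficients are 1 and 1/2, giving the familiar ratio (1 + x/2)/(1 − x/2).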


G.3  Cayley-Hamilton  Approach


A corollary of the Cayley-Hamilton theorem states that every well-defined function
of an N × N matrix can be expressed precisely as a polynomial function of the
matrix of degree N − 1:

$$
\exp(P\Delta) = \sum_{j=0}^{N-1} a_j (P\Delta)^{j}
\tag{80}
$$

The $a_j$ are complex scalars and are defined by the system of equations

$$
\exp(\lambda_i) = \sum_{j=0}^{N-1} a_j \lambda_i^{\,j}
\tag{81}
$$

Here, the $\lambda_i$ are the distinct eigenvalues of $P\Delta$. If $\lambda_i$ has multiplicity m, the
additional equations

$$
\exp(\lambda_i) = \sum_{j=n}^{N-1} \frac{j!}{(j-n)!}\,a_j \lambda_i^{\,j-n}, \qquad n = 1, 2, \ldots, m - 1
\tag{82}
$$

also hold. These represent the nth derivatives of each side of Equation (81) with
respect to $\lambda_i$. The equations are inverted to obtain each $a_j$, and the exponential
can then be found using Equation (80).
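
For a triangular matrix with distinct diagonal entries, the procedure reduces to solving a Vandermonde system and summing a matrix polynomial. The sketch below shows that case only; it is not the program of Appendix H, it does not handle repeated eigenvalues, and the function names are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def expm_cayley_hamilton(t):
    """exp(t) for a triangular matrix t whose diagonal entries are all distinct."""
    n = len(t)
    lam = [t[i][i] for i in range(n)]       # eigenvalues of a triangular matrix
    vander = [[l ** j for j in range(n)] for l in lam]
    coef = solve(vander, [math.exp(l) for l in lam])
    result = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for a_j in coef:                        # accumulate a_j * t**j
        for i in range(n):
            for j in range(n):
                result[i][j] += a_j * power[i][j]
        power = [[sum(power[i][k] * t[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return result
```

The Vandermonde matrix becomes ill-conditioned as eigenvalues approach one another, which is one way to see why nearly confluent eigenvalues are troublesome for this method.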

Such an approach appears highly suitable for numerical determination of the
exponential when N is small; there is no truncation error, and the most complex
numerical operations are inversion of an N × N matrix and finding powers of $P\Delta$.
It would seem that, since the only source of error is in these simple calculations,
precision would be very good.

Unfortunately,  when  calculating  Equation  (80)  on  a  floating-point  machine, 
there  is  a  source  of  error  sometimes  called  “catastrophic  cancellation”  [108].  Suppose 
that  one  is  working  with  word  size  of  15  decimal  digits  (typical  for  double-precision 
machines)  and  that  two  terms  of  this  equation  are  very  nearly  additive  inverses.  If  the 
series should sum to a number that is a factor of $10^{j}$ smaller in magnitude than the
magnitude  of  its  maximum  term,  one  may  not  reasonably  expect  precision  of  more 
than  15  —  j  digits.  This  catastrophic  cancellation  is  inherent  in  the  Cayley-Hamilton 
approach  and  becomes  a  common  problem  when  the  eigenvalues  span  several  powers 
of  ten. 

As  a  result,  the  precision  of  the  method  does  not  reach  that  of  the  floating 
point  word  size.  Precision  is  defined  here  as  the  maximum  error  over  all  elements 
of  the  exponential.  In  the  exponentiation  program  in  Section  H.3,  the  precision  of 
the  Cayley-Hamilton  approach  is  measured  roughly  by  finding  the  maximum  ratio 
obtained  by  dividing  each  of  the  series  terms  of  a  matrix  element  by  the  element, 
then  taking  the  maximum  of  this  quantity  over  all  the  elements.  The  number  of 
digits  of  precision  is  taken  to  be  the  difference  of  the  number  of  digits  in  the  word 
size  and  the  base  10  logarithm  of  the  maximum  ratio.  The  user  is  alerted  when  this 
procedure  detects  catastrophic  cancellation.  This  approach  is  heuristic;  it  will  not 
catch  all  severe  errors,  but  will  catch  many. 
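
The precision heuristic just described can be written down compactly. The sketch below follows the verbal description for a single matrix element and is not the code of Section H.3; the default word size of 15 digits, the cap at the word size, and the function name are assumptions.

```python
import math

def digits_retained(terms, word_digits=15):
    """Rough number of trustworthy decimal digits in sum(terms): the word size
    minus log10 of the largest term magnitude relative to the sum."""
    total = sum(terms)
    largest = max(abs(t) for t in terms)
    if total == 0.0:
        return 0.0                       # complete cancellation
    ratio = largest / abs(total)
    return min(float(word_digits), word_digits - math.log10(ratio))
```

For a full matrix, the same quantity would be computed for every element and the minimum reported.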
Precision is also lost when eigenvalues are confluent or nearly so, apparently
due to the inversion of the matrix of coefficients. Examples of both these types of
errors will be shown below.

G.4  Jordan  Approach

A number of decomposition approaches are possible, based on the fact that
if $P\Delta = SBS^{-1}$, then $\exp(P\Delta) = S \exp(B) S^{-1}$ [108]. The methods strive for two
conflicting objectives: forcing B to be close to diagonal so that exp(B) is easy to
calculate, and making S well-conditioned so that errors are not magnified [108]. The
two most commonly encountered approaches are the Jordan canonical form and the
Schur transformation. The Jordan transformation emphasizes the first objective,
while the Schur transformation emphasizes the second.

In  the  Jordan  transformation,  the  goal  is  to  obtain  B  in  Jordan  form,  in  which 
the  matrix  is  block  diagonal,  with  each  block  being  either  diagonal  or  bidiagonal, 
with  its  diagonal  elements  equal  and  its  superdiagonal  elements  all  equal  to  one. 
In  this  form,  exp (B)  is  easy  to  calculate  analytically,  and  the  difficult  part  is  in 
finding  S.  Wang  depended  heavily  on  this  approach  to  matrix  exponentiation  for 
evaluation  of  appointment  schedule  costs  as  a  means  of  simplifying  calculations  [163]. 
A  version  of  this  approach  is  provided  in  the  MATLAB  software,  and  this  was  used 
for  comparison  purposes. 

The approach is capable in theory of dealing with precisely confluent eigenvalues.
However, Parlett pointed out that if $P\Delta$ is defective, B is a discontinuous
function of the eigenvalues of $P\Delta$ [123]. Thus, in situations in which eigenvalues
are confluent or nearly so, a small roundoff error results in a very large change
to $\exp(P\Delta)$. As will be seen, unrealistic results are frequently generated by this
method.
G.5  Parlett  Approach

One  of  the  convenient  forms  sought  when  employing  decomposition  methods 
is  upper  triangular,  typically  obtained  by  a  Schur  transformation.  Here,  a  trans¬ 
formation  is  unnecessary,  since  PA  is  already  of  this  form.  Once  in  this  form,  the 
matrix  is  commonly  exponentiated  using  Parlett ’s  approach.  Parlett  demonstrated 
a  simple  recursive  relation  for  well-defined  functions  of  block- triangular  matrices,  of 
which  triangular  matrices  are  a  special  case.  Define  T  =  PA  and  F  =  exp(T).  The 
relation  is  based  on  the  facts  that  F  has  the  same  block  structure  as  T  and  that 
FT  =  TF.  From  these  two  properties,  it  can  be  shown  that  [124] 

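$$F_{r,r} = \exp(T_{r,r}) \quad \text{for } r = 1, 2, \ldots, N$$

$$T_{r,r}F_{r,s} - F_{r,s}T_{s,s} = \sum_{k=0}^{s-r-1} \left( F_{r,r+k}\,T_{r+k,s} - T_{r,s-k}\,F_{s-k,s} \right) \quad \text{for } r < s \qquad (83)$$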

where F_{r,s} may be an element or a rectangular block. Like the
Cayley-Hamilton approach, this approach suffers from numerical difficulties.
The flaw in this particular approach is that as the difference of the ith and
jth eigenvalues vanishes, the numerator and denominator of the expression for
the (i,j) element of the exponential also both vanish (for upper triangular
matrices). Repeated divisions of this type can lead to gross inaccuracies.
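The cancellation described above can be seen in a minimal sketch of the scalar (element-by-element) form of the recurrence for an upper triangular matrix. This is an illustration only, not the Parlett code of Appendix H.3; the function names and the 2 x 2 example are assumptions introduced here, and the reference value is computed with `math.expm1` to avoid the subtraction of nearly equal exponentials.

```python
import math

def parlett_expm_upper(T):
    """exp(T) for upper triangular T with distinct diagonal entries,
    using the scalar form of the recurrence that follows from F T = T F."""
    n = len(T)
    F = [[0.0] * n for _ in range(n)]
    for i in range(n):
        F[i][i] = math.exp(T[i][i])
    # Work outward one superdiagonal at a time, so that every F[i][k]
    # and F[k][j] used below has already been computed.
    for d in range(1, n):
        for i in range(n - d):
            j = i + d
            s = T[i][j] * (F[i][i] - F[j][j])
            for k in range(i + 1, j):
                s += F[i][k] * T[k][j] - T[i][k] * F[k][j]
            # Numerator and denominator both vanish as the eigenvalues merge.
            F[i][j] = s / (T[i][i] - T[j][j])
    return F

def f12_parlett(gap):
    """(1,2) element of exp(T) for T = [[-1, 1], [0, -1 - gap]]."""
    return parlett_expm_upper([[-1.0, 1.0], [0.0, -1.0 - gap]])[0][1]

def f12_stable(gap):
    """Same element, rewritten as exp(t22) * expm1(h) / h to avoid cancellation."""
    t22 = -1.0 - gap
    h = -1.0 - t22  # eigenvalue gap as actually stored in floating point
    return math.exp(t22) * math.expm1(h) / h

for gap in (1e-1, 1e-5, 1e-9, 1e-13):
    p, s = f12_parlett(gap), f12_stable(gap)
    print(f"gap={gap:.0e}  relative error of recurrence={abs(p - s) / s:.1e}")
```

Running the loop shows how the relative error of the recurrence behaves as the two diagonal entries approach each other; the 2 x 2 case isolates the single division that the text identifies as the source of trouble.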

One approach to dealing with precisely confluent eigenvalues is to modify them
slightly, still staying within the acceptable error for the problem, but
ensuring they are far enough apart to avoid large errors. This approach is
built into the Parlett exponentiation code in Appendix H.3, but the analysis in
the next section will show that it is a dangerous way to proceed, even when
using quadruple precision arithmetic.

Parlett suggested the possibility of applying a similarity transform that would
reposition the confluent eigenvalues into the same upper triangular block
[123]. The block could be exponentiated analytically, then the block version of
Parlett's method could be applied. This might be applicable to the problem at
hand, but it would destroy the convenient upper triangular structure and create
blocks that could be nearly the size of the entire matrix, putting the problem
solver in a potentially worse position than at the start. Further, such a block
scheme would only be helpful if the eigenvalues were precisely confluent.
Parlett did not comment on inaccuracies arising from nearly confluent
eigenvalues, which will be seen shortly to be substantial.

G.6 Selection of the Most Effective Approach

Given  the  goals  set  forth  above  and  the  problems  that  are  inherent  in  the 
approaches  discussed,  the  most  accurate  approach  must  be  determined,  with  short 
computation  time  being  a  secondary  goal.  To  do  so,  a  set  of  benchmarks  is  required 
for  which  the  exponential  is  known  to  sufficient  accuracy.  The  test  matrix  chosen  to 
compare  results  is 

$$Q = \begin{bmatrix}
-1.0 & 0.1 & 0.9 & 0.0 & 0.0 & 0.0 \\
0.0 & -1.0-\delta & 1.0+\delta & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & -1.0+\delta & 1.0-\delta & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & -1.0-2\delta & 0.5+\delta & 0.5+\delta \\
0.0 & 0.0 & 0.0 & 0.0 & \lambda & -\lambda \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0
\end{bmatrix} \qquad (84)$$

With appropriate choice of parameters δ and λ, this matrix provides some
insight into the behavior of the various methods in the presence of confluent
or nearly confluent eigenvalues, as well as eigenvalues that are widely
separated.

For the Maclaurin, Cayley-Hamilton, and Parlett approaches, the programs in
Appendix H were employed on a Pentium™ processor using double precision
arithmetic, which at best can give 14 or 15 places of precision. For the Padé
and Jordan approaches, routines supplied with the MATLAB software package were
employed. These also used the Pentium's double precision arithmetic routines.

First, the question of nearly confluent eigenvalues was addressed by fixing
λ = -20 and modifying δ. Since analytic exponentiation of this matrix is
unwieldy, results using the Maclaurin approach with quadruple precision were
employed as the benchmark. Results are in Table 23. Entries refer to the
maximum number of decimal places for which every matrix entry agrees with the
actual result, without rounding.

Consider the results of Parlett's method. For δ = 10^-5, if E = exp(Q), then
E(1,4) should be approximately 0.17167. To obtain E(1,4), Parlett's method
performs four exponentiations initially to obtain the diagonal elements of E,
after which it performs only 21 additions and subtractions, 14 multiplications,
and 6 divisions. Using double precision arithmetic yields 0.172213, in error by
0.3%. Clearly impossible results ensue for smaller δ. The same calculation in
quadruple precision yields
only 5 places of accuracy when δ = 10^-10 and fails completely when δ = 10

Moler and Van Loan commented that it would be interesting to compare the
Parlett approach to a scaling-and-squaring approach, and they hinted that the
Parlett approach would prove more accurate [108]. The above results for the
Maclaurin and Padé approaches indicate that the opposite is true in the
presence of nearly confluent eigenvalues. Because of this degradation,
Parlett's approach will not be considered further.

Likewise, the Cayley-Hamilton method yields E(1,4) = 0.171156, in error by
0.3%, showing its susceptibility to nearly confluent eigenvalues as well. The
Jordan method was equally poor. Conclusions regarding the relative
effectiveness of these two algorithms should be avoided, since they were
implemented in different languages by different people. However, it is clear
that both approaches are inadequate for the purpose at hand. This poor
performance of the Jordan algorithm in the presence of nearly confluent
eigenvalues was anticipated by Moler and Van Loan [108].

The Maclaurin and Padé approaches used nine and six terms, respectively, to
obtain each of the results in Table 23. More terms did not appear to help
achieve better convergence, although as few as three terms were required to
obtain 14-place accuracy in some cases. These methods appear highly accurate
and insensitive to the presence of nearly confluent eigenvalues.

The  analysis  above  used  the  results  of  a  quadruple  precision  Maclaurin  routine 
as  a  benchmark.  This  is  inelegant  and  may  lead  to  problems  if  the  routine  itself 
is  suspect.  For  the  majority  of  matrices,  this  may  be  the  only  alternative,  since 
analytic  results  are  difficult  to  obtain  and  result  in  very  long  expressions. 

If some regularity is present in Q, however, analytical results may be
obtainable, albeit with some effort. Consider the matrix formed when δ = 0:

$$Q = \begin{bmatrix}
-1.0 & 0.1 & 0.9 & 0.0 & 0.0 & 0.0 \\
0.0 & -1.0 & 1.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & -1.0 & 1.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & -1.0 & 0.5 & 0.5 \\
0.0 & 0.0 & 0.0 & 0.0 & \lambda & -\lambda \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0
\end{bmatrix} \qquad (85)$$

Exponentiation can be performed analytically most simply in this case by first
making the following definitions:

$$Q = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix}, \qquad
E = \exp(Q) = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix} \qquad (86)$$


where a is 4 x 4. Parlett's result can be used to obtain

$$A = \exp(a), \qquad C = \exp(c) \qquad (87)$$

$$Ab - bC = aB - Bc \qquad (88)$$

The  first  two  equations  are  solvable  analytically  by  the  Cayley-Hamilton  approach, 
after  which  the  third  produces  a  series  of  eight  simultaneous  equations  that  can  be 
solved  analytically  by  sequential  substitution.  The  results  are 


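$$A = \frac{1}{e}\begin{bmatrix}
1 & \tfrac{1}{10} & \tfrac{19}{20} & \tfrac{7}{15} \\
0 & 1 & 1 & \tfrac{1}{2} \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{bmatrix}, \qquad
C = \begin{bmatrix} e^{\lambda} & 1 - e^{\lambda} \\ 0 & 1 \end{bmatrix}$$

$$B = \begin{bmatrix}
\dfrac{-(28\lambda^3+141\lambda^2+258\lambda+151) + 6e^{\lambda+1}(10+9\lambda)}{120e(1+\lambda)^4} &
\dfrac{-(302\lambda^4+1180\lambda^3+1671\lambda^2+950\lambda+151) - 6e^{\lambda+1}(10+9\lambda) + 120e(1+\lambda)^4}{120e(1+\lambda)^4} \\[2ex]
\dfrac{-(\lambda^2+4\lambda+5) + 2e^{\lambda+1}}{4e(1+\lambda)^3} &
\dfrac{-(10\lambda^3+29\lambda^2+26\lambda+5) + 4e(1+\lambda)^3 - 2e^{\lambda+1}}{4e(1+\lambda)^3} \\[2ex]
\dfrac{-(2+\lambda) + e^{\lambda+1}}{2e(1+\lambda)^2} &
\dfrac{-(4\lambda^2+7\lambda+2) + 2e(1+\lambda)^2 - e^{\lambda+1}}{2e(1+\lambda)^2} \\[2ex]
\dfrac{-1 + e^{\lambda+1}}{2e(1+\lambda)} &
\dfrac{-(2\lambda+1) + 2e(1+\lambda) - e^{\lambda+1}}{2e(1+\lambda)}
\end{bmatrix} \qquad (89)$$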


As long as |1 + λ| is not very small, the accuracy of these expressions should
be near the word size on most floating-point machines. These expressions were
used to test the candidate approaches for robustness under widely differing
eigenvalues, as well

Table 24. Accuracy of exp(Q) as an eigenvalue diverges. Number of decimal
places of accuracy is given.

λ             Maclaurin(10)  Maclaurin(9)  Maclaurin(4)  Padé(6)  Padé(3)
-2 · 10^-8         12             10                        14        9
-2 · 10^-1         12             10             3          14        9
-2 · 10^0          12             10             3          14       14
-2 · 10^1          14             13             7          13       13

BHiSfl 
12

10 

12 

HuH 

13 

11 

13 

12 

(NEJM 

EESH 

12 

11 
11
11

11 

11 

11 

11 

11 

11 

9 

9 

11 

11 

11 

8 

8 

9 

8 

8 

8 

8 

as  in  the  presence  of  precisely  confluent  eigenvalues.  The  results  are  shown  in  Table 
24. 
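The closed-form blocks of equation (89) can be checked numerically against the relation in equation (88). The sketch below transcribes the entries of A, B, and C as reconstructed from (89) and evaluates the largest entry of (Ab - bC) - (aB - Bc), which should be at rounding level; it also checks that each row of [A B] sums to one, as it must because Q has zero row sums. The function names are illustrative assumptions, and the transcription should be compared with the original typeset equation before being relied upon.

```python
import math

def matmul(X, Y):
    """Product of two rectangular matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def closed_form_blocks(lam):
    """Blocks A (4x4), B (4x2), C (2x2) of exp(Q) for delta = 0."""
    e = math.e
    u = 1.0 + lam                  # the quantity 1 + lambda
    g = math.exp(lam + 1.0)        # e^(lambda + 1)
    A = [[1 / e, 1 / (10 * e), 19 / (20 * e), 7 / (15 * e)],
         [0.0, 1 / e, 1 / e, 1 / (2 * e)],
         [0.0, 0.0, 1 / e, 1 / e],
         [0.0, 0.0, 0.0, 1 / e]]
    C = [[math.exp(lam), 1.0 - math.exp(lam)], [0.0, 1.0]]
    p3 = 28 * lam**3 + 141 * lam**2 + 258 * lam + 151
    p4 = 302 * lam**4 + 1180 * lam**3 + 1671 * lam**2 + 950 * lam + 151
    w = 6 * g * (10 + 9 * lam)
    B = [[(-p3 + w) / (120 * e * u**4),
          (-p4 - w + 120 * e * u**4) / (120 * e * u**4)],
         [(-(lam**2 + 4 * lam + 5) + 2 * g) / (4 * e * u**3),
          (-(10 * lam**3 + 29 * lam**2 + 26 * lam + 5) + 4 * e * u**3 - 2 * g)
          / (4 * e * u**3)],
         [(-(2 + lam) + g) / (2 * e * u**2),
          (-(4 * lam**2 + 7 * lam + 2) + 2 * e * u**2 - g) / (2 * e * u**2)],
         [(-1 + g) / (2 * e * u),
          (-(2 * lam + 1) + 2 * e * u - g) / (2 * e * u)]]
    return A, B, C

def residual_88(lam):
    """Largest entry of (A b - b C) - (a B - B c); zero if (88) holds."""
    a = [[-1.0, 0.1, 0.9, 0.0], [0.0, -1.0, 1.0, 0.0],
         [0.0, 0.0, -1.0, 1.0], [0.0, 0.0, 0.0, -1.0]]
    b = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.5, 0.5]]
    c = [[lam, -lam], [0.0, 0.0]]
    A, B, C = closed_form_blocks(lam)
    Ab, bC, aB, Bc = matmul(A, b), matmul(b, C), matmul(a, B), matmul(B, c)
    return max(abs((Ab[i][j] - bC[i][j]) - (aB[i][j] - Bc[i][j]))
               for i in range(4) for j in range(2))

def row_sums(lam):
    """Row sums of the top four rows [A B] of exp(Q)."""
    A, B, _ = closed_form_blocks(lam)
    return [sum(A[i]) + sum(B[i]) for i in range(4)]

for lam in (-0.5, -20.0, -2000.0):
    print(lam, residual_88(lam), row_sums(lam))
```

Values of λ close to -1 are deliberately avoided in the loop, since the expressions divide by powers of 1 + λ.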

While not included in Table 24, for λ ∈ [-2000, -1], the Cayley-Hamilton
algorithm achieved a relatively constant maximum error of 6 · 10^-7. For
λ < -2 · 10^-5, the routine gave unrealistic results.

For an accuracy of 10^-7, Liou's rule requires ten terms of the Maclaurin
series for each value of λ, which is far more than necessary for most values.
The Maclaurin truncation rules were bypassed for this analysis, and the MATLAB
Padé routine was modified, to enable one to compare results using a constant
number of terms. The number of terms used in each calculation is shown in
parentheses in each column of Table 24.

Evaluating a small number of terms of the two series reveals that convergence
is very quick over a range of λ, but is slower above and below this range.
Accuracy becomes more uniform over a broader range of λ as the number of terms
increases.

MATLAB employs the Padé algorithm offered by Golub and Van Loan, who point out
that it requires on the order of 2(k + m + 1)N^3 flops. A flop is defined as a
single floating-point operation, while k is the number of terms evaluated and m
is the number of half-scalings required [54]. This is for a general matrix;
since the matrices here are triangular, only (1/3)N^3 rather than 2N^3 flops
are needed for each matrix multiplication, reducing the Padé requirement to
(1/3)(k + m + 1)N^3 [54]. The Maclaurin series approach also requires on the
order of (1/3)(k + m + 1)N^3, as it turns out. Thus, one can compare the
algorithmic efficiency of the two by considering the number of terms required
by each to achieve a given accuracy.

For the family of matrices analyzed here, the number of terms of the Maclaurin
series to reach the accuracy of the 6-term Padé series was 9, with between 0
and 6 scalings required. Thus, the relative efficiency of the Padé and
Maclaurin approaches varied between (9 + 0 + 1)/(6 + 0 + 1) ≈ 1.43 and
(9 + 6 + 1)/(6 + 6 + 1) ≈ 1.23. Moler and Van Loan reported that, when combined
with a scale-and-square routine, the Padé approach achieved similar precision
to that of the Maclaurin series approach using approximately half as many terms
[108]. This would imply a higher relative efficiency for the Padé approach than
that calculated here, but still less than 2.0.
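The relative-efficiency figures follow directly from the operation counts, since both methods scale as (k + m + 1)N^3 and any common constant factor cancels in the ratio. A minimal sketch of the arithmetic, with hypothetical function names and the constant factor deliberately left out:

```python
def flop_estimate(terms, scalings, n):
    """Operation count proportional to (k + m + 1) * N^3 (constant omitted)."""
    return (terms + scalings + 1) * n ** 3

def relative_efficiency(scalings, maclaurin_terms=9, pade_terms=6, n=6):
    """Ratio of Maclaurin work to Pade work for the same number of scalings."""
    return (flop_estimate(maclaurin_terms, scalings, n)
            / flop_estimate(pade_terms, scalings, n))

for m in range(0, 7):
    print(f"m={m}: ratio={relative_efficiency(m):.2f}")
```

The ratio falls from about 1.43 with no scalings to about 1.23 with six, because the cost of the squarings is common to both methods and dilutes the advantage of the shorter Padé series.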

The Padé approach has been shown here to be more efficient for representative
matrices, possibly 1.2 to 2 times so. However, while the improvement in
efficiency may make it worth coding the Padé approach for some problems, it
appears that the Maclaurin series approach is adequate for the matrices to be
dealt with here, and it will be employed.

Other approaches are possible. Since the matrix represents a system of
differential equations, Runge-Kutta approximation methods can be used. Lagrange
interpolation is also possible, which yields expressions similar to those
obtained by Parlett's method. Engineers often use a method based on inverse
Laplace transforms, which turns out to be related to the Cayley-Hamilton
approach [108]. Each of these approaches has been found deficient by other
researchers, so they were not tested [108].

G.7 Software Concerns

Several observations may be of use to the analyst using mathematical software
for the PC (personal computer). Floating point problems were observed in
Microsoft's PowerStation™, a FORTRAN compiler for PCs. Errors as high as 40%
originally were obtained for the test matrix using the Parlett program. These
stemmed from two sources. One is the documented problem with all versions of
this compiler in which double precision numbers must be initialized with double
precision constants; the statement a = 0. causes errors in the last 32 of the
64 bits in a double-precision word, while a = 0.0D+0 does not. The error arises
whether using a 32-bit or 16-bit processor. However, the majority of the error
was caused by an undocumented error generated when double and single precision
numbers are used in the same calculation. The manufacturer has acknowledged
this error [106]. After modifying the code to eliminate these compiler errors,
results were identical to those obtained on other compilers [23].
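The general hazard of letting single precision values enter a double precision calculation can be illustrated without the compiler in question. The sketch below does not reproduce the PowerStation behavior described above; it only shows, under the assumption of IEEE 754 arithmetic, how much a value is disturbed when it passes through a single precision representation. The helper name `to_single` is introduced here for illustration.

```python
import struct

def to_single(x):
    """Round an IEEE double to the nearest IEEE single, returned as a double."""
    return struct.unpack("f", struct.pack("f", x))[0]

x_double = 0.1
x_single = to_single(0.1)

# Absolute and relative contamination introduced by the single precision step.
print(abs(x_single - x_double))
print(abs(x_single - x_double) / x_double)

# Values exactly representable in single precision pass through unchanged.
print(to_single(0.5) == 0.5)
```

A relative disturbance of this size is harmless in a well-conditioned calculation, but in a marginally stable routine such as the Parlett recurrence it can be amplified into errors of the magnitude reported above.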

The MATLAB software package has two problems in its matrix exponentiation
routines. As noted above, its implementation of Jordan decomposition gives
unacceptably high errors in tests on the test matrix. Worse, one can obtain
negative and even infinite results without invoking a warning from the
error-handling routine in the program. The manufacturer considers this program
to be of pedagogic value only [109].

In tests of MATLAB's Maclaurin series algorithm on the test matrix, the
performance was poor compared to the FORTRAN subroutine EXP in Section H.1. For
λ ∈ [-200, -1], its precision was only 10^-6. For more negative values of λ,
unrealistic results were obtained. Addition of a scale-and-square routine to
the program
produced  precision  near  10  ^  for  all  values  of  A  tried.  The  efficacy  of  this  simple 
change  has  been  acknowledged  by  the  manufacturer  [109]. 

Due to previously reported problems with versions of the Pentium processor
and the need to track down numerical anomalies (which turned out to be the
compiler errors discussed above), floating point results from the Pentium and a
Sun SPARCstation™ workstation were compared. Both use two 32-bit words for
double precision storage and manipulation. The author conjectured that, since
the Pentium has several rounding modes and the default for FORTRAN compilers is
its truncation mode, perhaps the last digit might differ. This could create
substantial differences in final results, since routines like the Parlett
algorithm are only marginally stable, as shown above. Even after a number of
operations designed to degrade accuracy to only five places, the double
precision arithmetic results produced by the Sun FORTRAN compiler and those
produced internal to the Pentium were the same (albeit incorrect) to 15 digits.
It appears that the Pentium does indeed conform to IEEE Standards 754 and 854
for floating point storage and arithmetic [67].
Appendix H. Computer Programs

H.1 Sequence/Schedule Optimization Program

FORTRAN code: Following are the programs referenced in this dissertation. These
particular versions of the routines are the ones used in the validation of the
greedy sequencing algorithm and are presented in top-down order. For ease in
reference, Table 25 shows the structure of the program, with brief descriptions
of the function of each major routine and the page number on which it can be
found.

Table 25. Program structure and subroutine index

- COMMONS (Page 201): Used by INCLUDE statements throughout.
- HCSEARCH (Page 202): Program to compare the global optimum and the optimum
  found by the greedy sequencing algorithm for a series of problems.
- FPFLAW (Page 226): Checks for Pentium chip flaw.
- NORMDIST (Page 208): Returns a normally distributed variate.
- RECORD1 (Page 207): Records data used in each run.
- SEQUENCE2 (Page 209): Shell that takes the place of a sequencing program
  used in another version of the code.
- FIXEDLATTICE (Page 210): Finds the optimal schedule for a given sequence.
  -- BUILDQ (Page 221): Builds the transition matrix Q.
  -- MATMULT (Page 225): Multiplies two upper-triangular matrices.
  -- OMEGAEVAL (Page 222): Builds the conditional probability matrix Ω.
  -- EXP (Page 223): Performs matrix exponentiation.
  -- EVALUATE (Page 214): Performs cost evaluation algorithm.
  -- PEXTEND (Page 216): Modifies waiting times using Equation (7).
  -- FATHOM (Page 211): Finds SE (SL) using the fixed-lattice algorithm.
  -- ENUMERATE (Page 212): Evaluates the necessary schedules between SE
     and SL to determine the optimum.
  -- FLIP (Page 214): Recursive subroutine performs binary enumeration.
  -- EVALUATE (Page 214): Performs cost evaluation algorithm.
- GREEDYSEQ (Page 217): Performs the greedy sequencing algorithm.
  -- BUILDSAME (Page 219): Generates matrix indicating if two customers
     are of the same class. If so, GREEDYSEQ need not swap them.
MODULE SETSIZE
!variables shared by most routines to set array sizes
!MAXCUST and MAXPHASES are maximum number of customers and phases
!allowable in problem.  They need be changed only here.  Execution
!speed is dep on these variables, so set them small as practicable.
INTEGER LDA,MAXCUST,MAXPHASES,K,N,OT
PARAMETER(MAXPHASES=4)
PARAMETER(MAXCUST=6)
PARAMETER(LDA=(MAXPHASES+1)*(MAXCUST+1))
PARAMETER(MAXSEQ=721)
REAL*8 DELTA
COMMON /SIZEDATA/ K,N,OT,DELTA
END MODULE

MODULE SETQDATA
!variables shared by cost-evaluation-level routines
USE SETSIZE
INTEGER NQ,R(MAXCUST)
REAL*8 MU(MAXCUST,MAXPHASES),B(MAXCUST,MAXPHASES)
REAL*8 GAMMA(MAXCUST+1)
REAL*8 Q(LDA,LDA),E(LDA,LDA),OMEGA(MAXCUST+1,LDA)
COMMON /QDATA/ NQ,R,GAMMA,OMEGA,Q,E
END MODULE

MODULE SETCOSTDATA
!variables shared by schedule-optimization-level routines
USE SETSIZE
INTEGER NIT,FLAG,OTSLOT
REAL*8 CW(MAXCUST+1),HORIZON,OTPOINT
COMMON /COSTDATA/ CW,NIT,FLAG,HORIZON,OTSLOT,OTPOINT
END MODULE

MODULE SETSEQDATA
!variables shared by sequence-optimization-level routines
USE SETSIZE
PARAMETER(ACCURACY=1.0D-6)
INTEGER RSAVE(MAXCUST),SBEST(MAXCUST),NBEST
INTEGER SQ(MAXSEQ,MAXCUST),NSEQ,NINIT(2)
REAL*8 MUSAVE(MAXCUST,MAXPHASES),BSAVE(MAXCUST,MAXPHASES)
REAL*8 GAMMASAVE(MAXCUST+1),CWSAVE(MAXCUST+1)
REAL*8 C(0:MAXSEQ),WBEST(MAXCUST+1)
CHARACTER*10 ALPH(MAXSEQ)
COMMON /SEQDATA/ALPHINIT,NINIT,ALPH,C,SQ,MUSAVE,BSAVE,
1  GAMMASAVE,CWSAVE,RSAVE,SBEST,WBEST,N
END MODULE

MODULE EXPERIMENTDATA
!variables shared by experiment-level routines
USE SETSIZE
INTEGER NEXPT,MAXPICK,NPICK
PARAMETER(MAXPICK=7)
REAL*8 PHI2(MAXCUST),PHI3(MAXCUST),MEAN(MAXCUST)
REAL*8 PICKPHI2(MAXPICK),PICKPHI3(MAXPICK)
REAL*8 PICKR(MAXPICK),PICKMU1(MAXPICK)
REAL*8 PICKMUR(MAXPICK),PICKB1(MAXPICK)
COMMON /EXPTDATA/NEXPT,PHI2,PHI3,MEAN
END MODULE

! *******************************

PROGRAM HCSEARCH
!varies problem parameters and finds optimum by exhaustive
!enumeration.  Then it uses a greedy algorithm from two
!starting points to approximate the optimum.  Parameters altered
!in this version are CW(N+1) and HORIZON, keeping number
!of slots constant.

!inputs:
!SEQUENCE contains each possible sequence (alphanumeric)
!DISTRIB contains Coxian parameters for specific 3-moment sets

!outputs:
!OUT contains parameters used, optimum and near-optimal
!COUNTER.DAT contains results of greedy algorithm

USE MSFLIB
USE SETCOSTDATA
USE SETQDATA
USE SETSEQDATA
USE EXPERIMENTDATA
INTEGER I,J,NSOFAR,VARSEQ(MAXCUST)
INTEGER*2 IHR,IMIN,ISEC,I100
CHARACTER*10 HEADER*80, ALPHABET
REAL*8 VARMEAN,ERR,TEMP,TSTART,TSTARTINI,WVAR(MAXCUST)
CALL FPFLAW
OPEN(1,FILE='DISTRIB')
OPEN(2,FILE='SEQUENCE')
OPEN(3,FILE='OUTFULL')
OPEN(4,FILE='OUTMEAN')
OPEN(5,FILE='OUTRAND')
OPEN(6,FILE='ENUMOUT')
OPEN(7,FILE='OUTVAR')
ALPHABET='ABCDEFGHIJ'
NEXPT=0
K=11


WRITE(*,*)'STARTING EXPERIMENT NUMBER:'
READ(*,*)NEXPT
NEXPT=NEXPT-1
WRITE(*,*)'NUMBER OF CUSTOMERS:'
READ(*,*)N
WRITE(*,*)'RANDOM SEED:'
READ(*,*)I
VARMEAN=RAND(I)  !initialize random stream
WRITE(3,*)'SEED= ',I
CALL GETTIM(IHR,IMIN,ISEC,I100)
TSTART=IHR*3600+IMIN*60+ISEC+REAL(I100)/100

!get sequences to evaluate
READ(2,*)HEADER
DO I=1,MAXSEQ
  READ(2,*)(SQ(I,J),J=1,N)
  IF(SQ(I,1).EQ.0)GOTO 10
!alph is alpha equivalent of SQ, assuming services iid
  DO J=1,N
    ALPH(I)(J:J)=ALPHABET(SQ(I,J):SQ(I,J))
  END DO
END DO
WRITE(*,*)'WARNING: NOT ALL SEQUENCES WERE READ IN'
10 NSEQ=I-1


!get service distribution parameters
WRITE(3,*)
WRITE(3,*)'HEADER ON DIST FILE:'
READ(1,*)HEADER
WRITE(3,*)HEADER
READ(1,*)HEADER
WRITE(3,*)HEADER
WRITE(3,*)
DO I=1,30
  READ(1,*)PICKPHI2(I),PICKPHI3(I),PICKR(I),PICKMU1(I),
1  PICKMUR(I),PICKB1(I)
  IF(PICKPHI2(I).EQ.0.0) GOTO 20
END DO
20 NPICK=I-1
VARMEAN=10.0  !variance of mean distribution
WRITE(3,*)'K= ',K
WRITE(3,*)'VARMEAN= ',VARMEAN

!new experiment point
30 NEXPT=NEXPT+1
CALL GETTIM(IHR,IMIN,ISEC,I100)
TSTARTINI=IHR*3600+IMIN*60+ISEC+REAL(I100)/100-TSTART
TSTART=IHR*3600+IMIN*60+ISEC+REAL(I100)/100
WRITE(*,*)'EXPERIMENT: ',NEXPT,' TIME: ',TSTARTINI
SUMMEANS=0


!pick means from lognormal with log(mean) dist as N(0,VAR)
!pick weights from lognormal with log(CW) dist as N(0,0.5)
DO I=1,N
  CALL NORMDIST(MEAN(I))
  MEAN(I)=EXP(VARMEAN*MEAN(I))
  SUMMEANS=SUMMEANS+MEAN(I)
  CALL NORMDIST(CW(I))
  CW(I)=EXP(0.5*CW(I))
END DO

!sort customers by WSEPT
35 DO I=2,N
  FLAG=0
  IF(MEAN(I)/CW(I).LT.MEAN(I-1)/CW(I-1)) THEN
    TEMP=MEAN(I)
    MEAN(I)=MEAN(I-1)
    MEAN(I-1)=TEMP
    TEMP=CW(I)
    CW(I)=CW(I-1)
    CW(I-1)=TEMP
    FLAG=1
  END IF
END DO
IF(FLAG.EQ.1) GOTO 35


!assign phase rates and transition probabilities.  If PHI2>1,
!then choose high PHI3 or low PHI3 with equal probability.
DO I=1,N  !for each customer
  J=INT(5*RAND(0)+1)
  IF(J.GT.3) J=J+INT(2*RAND(0))*2
  PHI2(I)=PICKPHI2(J)
  WVAR(I)=(PHI2(I)-1)*MEAN(I)**2/CW(I)
  PHI3(I)=PICKPHI3(J)
  R(I)=PICKR(J)+1
  MU(I,1)=PICKMU1(J)/MEAN(I)
  B(I,1)=PICKB1(J)
  DO H=2,R(I)
    B(I,H)=1.0
    MU(I,H)=PICKMUR(J)/MEAN(I)
  END DO
  GAMMA(I)=1.0  !0.7+0.3*RAND(0)
END DO

!save settings before shuffling
CALL RECORD1(3)
RSAVE=R
MUSAVE=MU
BSAVE=B
GAMMASAVE=GAMMA
CWSAVE=CW
LASTCOST=0.0D0

DO J=-1,3
  CW(N+1)=10.0**FLOAT(J)
  DO I=1,20
    HORIZON=FLOAT(I)/10.0+SUMMEANS
    DELTA=HORIZON/FLOAT(K-1)
    OTPOINT=HORIZON
    OTSLOT=K-1
    NBEST=0
    C(0)=1.0D50
    CALL SEQUENCE2  !find cost of each sequence
!stop incrementing horizon if the cost is tiny
    IF(C(NBEST).LT.ACCURACY) GOTO 150

!starting sequence ordered by weighted means
    NSOFAR=1
    CALL GREEDYSEQ(ERR,ITER,NSOFAR)
    ERRFLAG=0
    IF(ERR.GT.0.0) ERRFLAG=1
    WRITE(4,140)NEXPT,C(NSOFAR),C(NBEST),ITER,HORIZON,
1    HORIZON/SUMMEANS,CW(N+1),ALPH(NSOFAR)
    WRITE(3,140)NEXPT,C(NSOFAR),C(NBEST),ITER,
1    HORIZON,HORIZON/SUMMEANS,CW(N+1),ALPH(NSOFAR)

!random starting sequence
    NSOFAR=INT(NSEQ*RAND(0)+1)
    CALL GREEDYSEQ(ERR,ITER,NSOFAR)
    ERRFLAG=0
    IF(ERR.GT.0.0) ERRFLAG=1
    WRITE(5,140)NEXPT,C(NSOFAR),C(NBEST),ITER,
1    HORIZON,HORIZON/SUMMEANS,CW(N+1),ALPH(NSOFAR)
    WRITE(3,140)NEXPT,C(NSOFAR),C(NBEST),ITER,
1    HORIZON,HORIZON/SUMMEANS,CW(N+1),ALPH(NSOFAR)


!order starting sequence by weighted variances
    NSOFAR=1
    DO II=1,N
      VARSEQ(II)=II
    END DO
40  FLAG=0
    DO II=2,N
      IF(WVAR(VARSEQ(II)).LT.WVAR(VARSEQ(II-1)))THEN
        FLAG=1
        TEMP=VARSEQ(II)
        VARSEQ(II)=VARSEQ(II-1)
        VARSEQ(II-1)=TEMP
      END IF
    END DO
    IF(FLAG.EQ.1) GOTO 40

!find index of variance-ordered starting sequence
    DO NSOFAR=1,NSEQ
      DO JJ=1,N
        IF(SQ(NSOFAR,JJ).NE.VARSEQ(JJ)) GOTO 50
      END DO
      GOTO 60
50  END DO
    WRITE(*,*)'VARIANCE START SEQUENCE NOT FOUND'
    WRITE(7,*)'VARIANCE START SEQUENCE NOT FOUND'
60  CONTINUE


!perform greedy sequencing algorithm
    CALL GREEDYSEQ(ERR,ITER,NSOFAR)
    ERRFLAG=0
    IF(ERR.GT.0.0) ERRFLAG=1
    WRITE(7,140)NEXPT,C(NSOFAR),C(NBEST),ITER,HORIZON,
1    HORIZON/SUMMEANS,CW(N+1),ALPH(NSOFAR)
    WRITE(3,140)NEXPT,C(NSOFAR),C(NBEST),ITER,
1    HORIZON,HORIZON/SUMMEANS,CW(N+1),ALPH(NSOFAR)
140 FORMAT(I3,' ',E12.5,' ',E12.5,' ',I3,' ',E12.5,' ',
1    E12.5,' ',E12.5,' ',A<N>,' ',A<N>,' ',A23)
  END DO  ! I
150 END DO  ! J
GOTO 30
CLOSE(1)
CLOSE(2)
CLOSE(3)
CLOSE(4)
CLOSE(5)
CLOSE(6)
CLOSE(7)
RETURN
END

SUBROUTINE RECORD1(FILE)
!transfers input data to output file
USE SETQDATA
USE SETCOSTDATA
USE SETSEQDATA
USE EXPERIMENTDATA
INTEGER J,I,FILE
WRITE(FILE,*)
WRITE(FILE,*)
WRITE(FILE,*)
WRITE(FILE,*)'EXPERIMENT # ',NEXPT
WRITE(FILE,*)'CUST PHASES MEAN PHI2 PHI3 WEIGHT'
DO J=1,N
  WRITE(FILE,50)J,R(J),MEAN(J),PHI2(J),PHI3(J),CW(J)
END DO
50 FORMAT(I3,' ',I3,' ',E12.3,' ',E12.3,' ',E12.3,' ',E12.3)
WRITE(FILE,*)
WRITE(FILE,*)'PHASE RATES FOR EACH CUSTOMER:'
DO J=1,N
  WRITE(FILE,60)J,(MU(J,I),I=1,R(J))
END DO
60 FORMAT(I3,32E9.2)
WRITE(FILE,*)
WRITE(FILE,*)'TRANSITION PROBABILITIES FOR EACH CUSTOMER:'
WRITE(FILE,*)'(FIRST LISTED IS SHOW PROBABILITY)'
DO J=1,N
  WRITE(FILE,70)J,GAMMA(J),(B(J,I),I=1,R(J)-1)
END DO
70 FORMAT(I3,E9.2,32E9.2)
WRITE(FILE,*)
RETURN
END

! **********************************

SUBROUTINE NORMDIST(X1)
!returns a standard normal random variate
!Adapted from Marsaglia and Bray,
!SIAM Review 6:260-64, 1964
REAL*8 X1,X2,V1,V2,Y,W
INTEGER RESERVE
SAVE RESERVE,X2
IF (RESERVE.EQ.0) THEN  !generate two new normal variates
  W=2.0
  DO WHILE(W.GT.1.0)
    V1=2*RAND(0)-1
    V2=2*RAND(0)-1
    W=V1**2+V2**2
  END DO
  Y=SQRT(-2*LOG(W)/W)
  X1=V1*Y
  X2=V2*Y
ELSE  !use second random variate generated from last call
  X1=X2
END IF
RESERVE=1-RESERVE
RETURN
END

! **********************************

SUBROUTINE SEQUENCE2
USE SETCOSTDATA
USE SETQDATA
USE SETSEQDATA
INTEGER S(N+1)
REAL*8 W(N+1)

!set parameters to those of sequence L
DO L=1,NSEQ
  FLAG=0
  DO J=1,N
    R(J)=RSAVE(SQ(L,J))
    GAMMA(J)=GAMMASAVE(SQ(L,J))
    CW(J)=CWSAVE(SQ(L,J))
    DO JJ=1,MAXPHASES
      B(J,JJ)=BSAVE(SQ(L,J),JJ)
      MU(J,JJ)=MUSAVE(SQ(L,J),JJ)
    END DO
40  END DO

!optimize schedule for customer L
  CALL FIXEDLATTICE(C(L),W,S)
  IF(C(L).LT.C(NBEST)) THEN
    NBEST=L
    SBEST=S
!   WBEST=W
  END IF
90 END DO
RETURN
END


! **********************************
SUBROUTINE FIXEDLATTICE(C2,W2,S2)
!find optimal cost and schedule for a given sequence
USE SETCOSTDATA
REAL*8 C1,C2,ERROR,W1(N+1),W2(N+1)
INTEGER I,APARTFLAG,S1(N+1),S2(N+1),S(N+1),NIT1,NIT2,CHECKSUM

!build matrix Q, determine matrix size
!also build expected wait matrix OMEGA
CALL BUILDQ

!build conditional wait matrix OMEGA
CALL OMEGAEVAL

!find E=EXP(Q*DELTA).
!Only need to do this when lattice size changes
ERROR = 1.0D-7
CALL EXP(ERROR)

!put all customers but the first at K-1
S1=K-1
S=K-1
S1(1)=0
S(1)=0
FLAG=0
80 CALL EVALUATE(S1,W1,C1)
NIT=1

!first pass
CALL FATHOM(S1,S,W1,C1)
NIT1=NIT

!set second pass to start each arrival 1 slot earlier
!than S1 if FLAG=0 and 1 slot later if FLAG=1
DO J=2,N
  S(J)=MIN(MAX(S1(J)-1+2*FLAG,0),K-1)
  S2(J)=S(J)
END DO
S(1)=0
S2(1)=0
S2(N+1)=S2(N)
S(N+1)=S(N)
CALL EVALUATE(S2,W2,C2)
FLAG=1-FLAG
NIT=1
CALL FATHOM(S2,S,W2,C2)
NIT2=NIT

!passes may not coincide, in which case
!intermediate schedules must be enumerated
CALL ENUMERATE(S1,S2,C1,C2,W1,W2,I,APARTFLAG,CHECKSUM)
WRITE(6,*)NIT1,' ',NIT2,' ',NIT,' ',CHECKSUM
RETURN
END

! ***********************************

      SUBROUTINE FATHOM(SOPT,S,WOPT,COPT)

!Uses Simeoni-style algorithm to find early bound on minimum of
!function "cost". Assumes elements 1 and N+1 are fixed at
!bounds 0 and K-1

!FLAG       SOPT is early schedule if FLAG=1, late schedule if FLAG=0
!N          length of S and SOPT, including first and last elements
!K          number of schedule slots
!SOPT,COPT  early schedule (in numbers of time slots) and cost
!S,C        current test schedule (in numbers of time slots) and cost
!DELTA      length of time step
!M          tracks which element of S is being altered
!NIT        number of cost evaluations performed

      USE SETCOSTDATA
      REAL*8 C,COPT,W(N+1),WOPT(N+1)
      INTEGER M,S(N+1),SOPT(N+1)

!find latest customer who is not already at end of schedule
  100 M=N*FLAG+(1-FLAG)*2              !M=N or M=2
      DO WHILE (S(M).EQ.(K-1)*FLAG)
        M=M+1-2*FLAG                   !subtract or add 1
      END DO
      IF(M.EQ.FLAG+(1-FLAG)*(N+1)) GOTO 300   !all arrivals shifted
  200 S(M)=S(M)-1+2*FLAG               !try to shift customer
      IF(S(M+1).GE.S(M).AND.S(M).GE.S(M-1)) GOTO 250 !if order unchanged
      IF(M.EQ.N.AND.FLAG.EQ.1) GOTO 250 !last slot can be shifted forward
      S(M)=S(M)+1-2*FLAG               !no good, undo shift
      M=M+1-2*FLAG                     !if not at end, try to shift next customer
      IF(FLAG*(M-2)+(1-FLAG)*(K-1-M).GT.0) GOTO 200

      GOTO 300

  250 CALL EVALUATE(S,W,C)
      NIT=NIT+1
      IF(C.LT.COPT) THEN               !replace SOPT with S
        SOPT=S
        COPT=C
        WOPT=W
        GO TO 100
      ELSE                             !undo shift - no improvement
        S(M)=S(M)+1-2*FLAG
        M=M+1-2*FLAG                   !subtract or add 1
        IF(FLAG*(M-2)+(N-M)*(1-FLAG).GE.0) GOTO 200
      END IF
  300 CONTINUE
      RETURN
      END

! **********************************

      SUBROUTINE ENUMERATE(S1,S2,C1,C2,W1,W2,I,APARTFLAG,
     1 CHECKSUM)

!enumerates all schedules between S1 and S2

      USE SETCOSTDATA
      REAL*8 C,C1,C2,W(N+1),W1(N+1),W2(N+1)
      INTEGER S(N+1),S1(N+1),S2(N+1),APARTFLAG
      INTEGER J,I,H
      INTEGER DIFFER(N),CHECKSUM
      NIT=0
      APARTFLAG=0

!CHECKSUM is total number of positions current schedule
!and S1 differ by

!generate list of positions in which S1 differs from S2
      I=0
      DO J=2,N
        IF(S1(J).NE.S2(J)) THEN
          IF(ABS(S1(J)-S2(J)).NE.1) THEN
            APARTFLAG=1
            RETURN
          END IF
          I=I+1
          DIFFER(I)=J
        END IF
      END DO

!pick the lowest cost of SE and SL to start enumeration
!use S1 as starting point and S2 as current optimum
      IF(C2.LT.C1) THEN
        C1=C2
        S1=S2
        W1=W2
        FLAG=1-FLAG
      ELSE
        C2=C1
        S2=S1
        W2=W1
      END IF

!go through all possibilities using a binary counting scheme
      S=S1
      CHECKSUM=0
      DO J=1,2**I-2
        H=1
!next schedule in binary scheme
        CALL FLIP(H,FLAG,DIFFER,S,S1,CHECKSUM)
!no need to evaluate if SE or SL differ by just one place,
!since these schedules are already evaluated
        IF(CHECKSUM.EQ.1 .OR. CHECKSUM.EQ.I-1) GOTO 200
!no need to evaluate if customers have changed order
!since these schedules are infeasible
        DO H=1,N-1
          IF(S(H).GT.S(H+1)) GOTO 200
        END DO
!schedule must be evaluated
        CALL EVALUATE(S,W,C)
        NIT=NIT+1
        IF(C.LT.C2)THEN
          C2=C
          S2=S
          W2=W
        END IF
  200 END DO
  100 RETURN
      END

! *******************************

      RECURSIVE SUBROUTINE FLIP(J,FLAG,DIFFER,S,S1,
     1 CHECKSUM)

!flips Jth arrival of current schedule. If the position
!is the same as that in the starting schedule, it calls itself
!recursively to flip the (J+1)st customer

      USE SETSIZE
      INTEGER J,I,S(N+1),S1(N+1),DIFFER(N),CHECKSUM,FLAG
      I=DIFFER(J)
      S(I)=S(I)+1-2*FLAG
      IF(ABS(S1(I)-S(I)).EQ.2) THEN
        S(I)=S1(I)
        CALL FLIP(J+1,FLAG,DIFFER,S,S1,CHECKSUM)
        CHECKSUM=CHECKSUM-1
      ELSE
        CHECKSUM=CHECKSUM+1
      END IF
      RETURN
      END

! *******************************

      SUBROUTINE EVALUATE(TAU,W,COST)

!compute cost and waiting time vector of schedule TAU.
!Value of TAU(N+1) is modified here and then set back to K-1.
!overtime is the waiting time of a fictitious customer at
!TAU(N+1), plus time elapsed from OTSLOT to TAU(N), if any.

!N       number of customers
!R       number of phases for each customer
!NQ      used dimensions of Q
!E       EXP(Q*DELTA)
!Q       input matrix
!ERROR   max error allowed in computation of exp(Q*DELTA)
!TAU     arrival vector (schedule), in units of DELTA
!GAMMA   prob of each customer showing
!DELTA   schedule lattice size
!APHASE  number of phases arrived in system so far
!W       expected wait of each customer
!OTSLOT  slot in which overtime begins
!OTSLOT2 =TAU(N) if TAU(N)>OTSLOT, =OTSLOT otherwise
!OTPOINT onset of overtime

      USE SETQDATA
      USE SETCOSTDATA
      INTEGER TAU(N+1),I,J,DT,APHASE   !,DISTFLAG
      REAL*8 WAITCOST,COST,PM(LDA),PP(LDA),OVERCOST
      REAL*8 WW(N+1),W(N+1)
!SAVE DISTFLAG

      PM=0.0D0             !prob vector up to next arrival
      PP=0.0D0             !prob vector just after arrival
      PM(1)=1.0D0
      APHASE=0             !sum of customers' phases so far
      W(1)=0.0D0           !individual customers' waits, assuming they showed
      WAITCOST=0.0D0       !total wait
      WW=0.0D+0            !tracking variable - obsolete

!Set TAU(N+1) to facilitate calculation of overtime
      TAU(N+1)=MAX(OTSLOT,K-1)

      DO J=1,N
        W(J+1)=0.0D+0
!place all probability mass beyond APHASE at APHASE+1
        DO I=APHASE+2,NQ
          PM(APHASE+1)=PM(APHASE+1)+PM(I)
          PM(I)=0.0D+0
        END DO
!push (1-GAMMA) of the exit probability mass to the exit of
!the next arrival to account for the probability of a no-show.
        PP=PM
        PP(APHASE+R(J)+1)=(1-GAMMA(J))*PM(APHASE+1)
        PP(APHASE+1)=GAMMA(J)*PM(APHASE+1)
        APHASE=APHASE+R(J)
!find probability vector at TAU(J+1)
        DT=TAU(J+1)-TAU(J)
        CALL PEXTEND(DT,PP,PM)
!find W(J+1)
        DO I=1,APHASE
          W(J+1)=W(J+1)+PM(I)*OMEGA(J+1,I)
          IF(I.GT.APHASE-R(J)) WW(J+1)=WW(J+1)+PM(I)*OMEGA(J+1,I)
        END DO
!GAMMA(J) factor added 25Feb97 to make J's wait be zero if no-show
        WAITCOST=WAITCOST+W(J)*GAMMA(J)*CW(J)
      END DO

!if OTPOINT>K-1, then TAU(N+1) is placed at OTSLOT,
!and overtime is W(N+1).
!otherwise, TAU(N+1) is left at K-1,
!and overtime is W(N+1)+(K-1-OTSLOT)*DELTA
      OVERCOST = (W(N+1)+MAX(0,TAU(N+1)-OTSLOT)*DELTA)*CW(N+1)
      COST=WAITCOST+OVERCOST

!reset TAU(N+1) for use as a bound in FATHOM
      TAU(N+1)=K-1
!WRITE(2,*) (TAU(J),J=1,N+1),COST
      RETURN
      END


! **************************************

      SUBROUTINE PEXTEND(DT,PP,PM)

!finds PM = PP*exp(Q*DELTA*(tau(J+1)-tau(J)))
!accounts for change in state between times
!TAU(J) and TAU(J+1)

!PM(J) state probability vector just before J's arrival
!PP(J) state probability vector just after J's arrival
!DT    time step between arrivals

      USE SETQDATA
      INTEGER J,I,DT
      REAL*8 PP(LDA),PM(LDA),ET(LDA,LDA)

!initialize ET to identity matrix
      ET=0.0D0
      DO J=1,NQ+1
        ET(J,J)=1.0D0
        PM(J)=0.0D0
      END DO

!ET=EXP(Q*DELTA*DT)
      DO J=1,DT
        CALL MATMULT(ET,E,NQ+1,LDA)
      END DO

      DO J=1,NQ+1
        DO I=1,NQ+1
          PM(J)=PM(J)+PP(I)*ET(I,J)
        END DO
      END DO
      RETURN
      END
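
PEXTEND builds the full matrix ET = E**DT and then multiplies it by PP. Because only the row vector PP*E**DT is needed, the same propagation can be illustrated by applying E to the vector DT times. The following is a minimal free-form Fortran 90 sketch, not the dissertation's code: the program name, the 2-state matrix (one transient state and one absorbing exit state), and the numbers are assumptions chosen for illustration.

```fortran
! Illustrative sketch only: propagate a state probability row vector
! through DT lattice steps, p <- p * E, without forming E**DT.
program pextend_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: e(n,n), p(n)
  integer :: dt, step

  ! Hypothetical one-step matrix (column-major fill):
  ! row 1 = (0.9, 0.1), row 2 = (0.0, 1.0), so state 2 is absorbing.
  e = reshape((/ 0.9_dp, 0.0_dp, 0.1_dp, 1.0_dp /), (/ n, n /))

  p = (/ 1.0_dp, 0.0_dp /)   ! all mass starts in state 1
  dt = 3                     ! number of lattice steps between arrivals

  do step = 1, dt
    p = matmul(p, e)         ! row vector times matrix
  end do

  print '(2f8.4)', p
end program pextend_demo
```

With these values the vector after three steps is (0.729, 0.271), since the mass remaining in state 1 is 0.9**3. Each step costs one vector-matrix product rather than a matrix-matrix product.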

! **************************************

      SUBROUTINE GREEDYSEQ(ERR,ITER,NSOFAR)

!tests whether greedy algorithm is effective when
!started from two sequences (input). It compares
!result to global optimum found by exhaustive
!enumeration

!Greedy Algorithm: From some starting sequence, find
!the pairwise swap that yields the greatest cost
!improvement, if any. If none, optimum is reached.
!Otherwise, make the swap and repeat the process.

      USE SETSEQDATA
      INTEGER H,I,J,NSOFAR,SAME(N,N),FLAG,BADALG
      INTEGER ITER
      CHARACTER*10 TEMPC
      REAL*8 ERR
      BADALG=0
      ITER=0

      DO H=1,1                         !index of test starting sequence
    5   CALL BUILDSAME(ALPH(NSOFAR),N,SAME)
        FLAG=0
        ITER=ITER+1
        DO I=1,N-1
          DO J=I+1,N                   !I and J are swap indices
            IF(SAME(I,J).EQ.0) THEN    !valid sequence...
!TEMPC is ALPH(NSOFAR,*) with I and J swapped
              DO II=1,N
                TEMPC=ALPH(NSOFAR)
              END DO
              TEMPC(J:J)=ALPH(NSOFAR)(I:I)
              TEMPC(I:I)=ALPH(NSOFAR)(J:J)
!find index of corresponding sequence
              DO II=1,NSEQ
                IF(ALPH(II).EQ.TEMPC) GOTO 20
   10         END DO
              WRITE(*,*)'ERROR IN SWAP ROUTINE'
              WRITE(4,*)'ERROR IN SWAP ROUTINE'
   20         CONTINUE                 !II is now index of swapped sequence
              IF(C(II).LT.C(NSOFAR)) THEN
                NSOFAR=II
                GOTO 5
              END IF
            END IF                     !valid sequence...
          END DO                       !J
        END DO                         !I
        ERR=(C(NSOFAR)-C(NBEST))/C(NBEST)
   30 END DO                           !H
      RETURN
      END

! ******************************

      SUBROUTINE BUILDSAME(TEMPLATE,N,SAME)

!generate SAME, where SAME(I,J)=1 if service
!distributions of I and J are the same.

      INTEGER I,J,SAME(N,N)
      CHARACTER*10 TEMPLATE

      SAME=1
      DO I=1,N-1
        DO J=I+1,N
          IF(TEMPLATE(I:I).NE.TEMPLATE(J:J)) THEN
            SAME(I,J)=0
            SAME(J,I)=0
          END IF
        END DO                         !J
      END DO                           !I
      RETURN
      END

! ******************************

      SUBROUTINE INPUT

!Accepts information from COXINPUT.TXT. Format of this
!file is described in its comments

      USE SETQDATA
      USE SETCOSTDATA
      INTEGER I,J,ERRCODE
      CHARACTER A*10

      OPEN (1,FILE='coxinput.txt')
      ERRCODE=0

      READ (1,*) A
      READ (1,*) A
      READ (1,*) N
      IF(N.GT.MAXCUST) THEN
        ERRCODE=1
        GOTO 100
      ENDIF

      READ (1,*) A
      READ (1,*) (R(J),J=1,N)
      DO J=1,N
        IF(R(J).LT.0 .OR. R(J).GT.MAXPHASES) ERRCODE=2
      END DO

      READ (1,*) A
      DO J=1,N
        READ (1,*) (MU(J,I),I=1,R(J))
        DO I=1,R(J)
          IF(MU(J,I).LE.0) ERRCODE=3
        END DO
      END DO

      READ (1,*) A
      READ (1,*) A
      DO J=1,N
        READ (1,*) GAMMA(J),(B(J,I),I=1,R(J)-1)
        IF(GAMMA(J).GT.1.0D0 .OR. GAMMA(J).LT.0.0D0) ERRCODE=4
        DO I=1,R(J)-1
          IF(B(J,I).GT.1.0D0 .OR. B(J,I).LT.0.0D0) ERRCODE=5
        END DO
      END DO

      GAMMA(N+1)=1.0D0
      READ (1,*) A
      READ (1,*) (CW(J),J=1,N+1)

  100 IF(ERRCODE.NE.0) THEN
        SELECT CASE (ERRCODE)
        CASE(1)
          WRITE(*,*)'ERROR READING NUMBER OF CUSTOMERS - MAX EXCEEDED'
        CASE(2)
          WRITE(*,*)'ERROR READING NUMBER OF PHASES - MAX EXCEEDED'
        CASE(3)
          WRITE(*,*)'ERROR READING PHASE RATES'
        CASE(4)
          WRITE(*,*)'ERROR READING SHOW PROBABILITIES'
        CASE(5)
          WRITE(*,*)'ERROR READING PHASE PROBABILITIES'
        CASE(6)
          WRITE(*,*)'ERROR READING DELTA'
        CASE DEFAULT
          WRITE(*,*)'UNSPECIFIED ERROR'
        END SELECT
      END IF
      CLOSE (1)
      RETURN
      END


! ******************************

      SUBROUTINE BUILDQ

!Builds Q matrix from data in COXINPUT.TXT. Returns NQ
!Also calls OMEVAL to build conditional wait matrix OMEGA
!modified 26feb97 to incorporate show prob into Q

      USE SETQDATA
      INTEGER I,J
      Q=0.0D0
      NQ=0

!LDA  max size of Q.
!NQ   index of current row of matrix, ends as size of Q. Output.
!MU   transition rate of each phase
!B    routing probability to next phase.
!note: B(H,0) is show rate of customer H. Not accounted for in Q.

!determine Q
      DO J=1,N
        DO I=1,R(J)-1
          Q(NQ+I,NQ+I)=-MU(J,I)*DELTA
          Q(NQ+I,NQ+I+1)=B(J,I)*MU(J,I)*DELTA
          IF(J.EQ.N) THEN
            Q(NQ+I,NQ+R(J)+1)=(1-B(J,I))*MU(J,I)*DELTA
          ELSE
            Q(NQ+I,NQ+R(J)+1)=GAMMA(J+1)*(1-B(J,I))*MU(J,I)*DELTA
            Q(NQ+I,NQ+R(J)+R(J+1)+1)=
     1        (1-GAMMA(J+1))*(1-B(J,I))*MU(J,I)*DELTA
          END IF
        END DO
        NQ=NQ+R(J)
        Q(NQ,NQ)=-MU(J,R(J))*DELTA
        Q(NQ,NQ+1)=MU(J,R(J))*DELTA
        DO I=1,NQ
          IF(J.NE.N)THEN
            Q(I,NQ+1+R(J+1))=(1-GAMMA(J+1))*Q(I,NQ+1)
            Q(I,NQ+1)=GAMMA(J+1)*Q(I,NQ+1)
          END IF
        END DO
      END DO

!append a last row of zeros to Q (exit state)
      NQ=NQ+1
      RETURN
      END

! ***********************************

      SUBROUTINE OMEGAEVAL

!evaluates conditional waiting matrix OMEGA.
!OMEGA(J,I) is the expected waiting time for the Jth
!customer, given the current phase is I, and assuming
!all customers show and are immediately available

      USE SETQDATA
      INTEGER J,IN,IM,IQ
      REAL*8 SVC(MAXCUST+1,LDA)

      OMEGA=0.0D0
      SVC=0.0D0

!define SVC(J,I), the expected service of customer J, given
!the system is in the customer's Ith phase of service
      DO J=1,N
        SVC(J,R(J))=1/MU(J,R(J))
        DO I=R(J)-1,1,-1
          SVC(J,I)=SVC(J,I+1)*B(J,I)+1/MU(J,I)
        END DO
      END DO

!find OMEGA(J,IQ), J's expected wait, given the current state is IQ
      IQ=0
      DO IN=1,N                        !customer index
        DO IM=1,R(IN)                  !customer's stage index
          IQ=IQ+1                      !current state index
          OMEGA(IN+1,IQ)=SVC(IN,IM)    !add partial svc of current customer
          DO J=IN+2,N+1                !customer #
!add full (possible) svc of other customers (recursively)
            OMEGA(J,IQ)=OMEGA(J-1,IQ)+GAMMA(J-1)*SVC(J-1,1)
          END DO
        END DO
      END DO
      RETURN
      END
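
The backward recursion that OMEGAEVAL uses for SVC (expected remaining service from phase I onward) can be checked by hand on a small case. The sketch below is a free-form Fortran 90 illustration under assumed data: a single customer with three phases, hypothetical rates MU = (2, 4, 4), and hypothetical routing probabilities B = (0.5, 1.0). None of these numbers come from the dissertation.

```fortran
! Illustrative sketch only: expected remaining service time of a
! Coxian customer, computed backward from the last phase.
program svc_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0), r = 3
  real(dp) :: mu(r), b(r-1), svc(r)
  integer :: i

  mu = (/ 2.0_dp, 4.0_dp, 4.0_dp /)   ! assumed phase rates
  b  = (/ 0.5_dp, 1.0_dp /)           ! assumed prob. of continuing to next phase

  ! Last phase: mean sojourn time only.
  svc(r) = 1.0_dp/mu(r)

  ! Earlier phases: own mean sojourn plus, with probability b(i),
  ! the remaining service starting from the next phase.
  do i = r - 1, 1, -1
    svc(i) = svc(i+1)*b(i) + 1.0_dp/mu(i)
  end do

  print '(3f8.4)', svc
end program svc_demo
```

For these inputs the recursion gives SVC(3) = 0.25, SVC(2) = 0.25 + 0.25 = 0.5, and SVC(1) = 0.5*0.5 + 0.5 = 0.75.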
! **********************************

      SUBROUTINE EXP (ERROR)

!Employs a scale-and-square algorithm to create a matrix that has
!a smaller determinant, using a Taylor series approximation to obtain
!exp(Q/M), then taking that matrix to the Mth power to get exp(Q).

!NP is the smallest power of 2 that ensures det(Q*DELTA/NP) < 1.
!Stopping criteria are max # terms in Taylor series (100) or Liou's
!criterion, based on the 2-norm (Proc IEEE, 54(1966)20-23), whichever
!is satisfied first.

!Only the exponential of the P matrix (Q matrix minus the last row
!and column) is calculated, after which the last row and column are
!computed and appended, allowing slight savings in time and accuracy.

!Q      input matrix
!QM     Q scaled by F
!QT     current term of Taylor series of QM
!F      scaling factor
!NORM   Frobenius norm. Should actually be 2-norm, but F-norm
!       is always larger, so this is conservative
!ERROR  allowable error in scaled matrix values

      USE SETQDATA
      INTEGER I,J,H
      REAL*8 QM(LDA,LDA),NORM,QT(LDA,LDA),FACT,ERROR

!initialize
      DO I=1,NQ-1
        DO J=1,NQ-1
          QM(I,J)=Q(I,J)
          QT(I,J)=0.0D0
          E(I,J)=0.0D0
        END DO
        QT(I,I)=1.0D0
        E(I,I)=1.0D0
      END DO

!find the Frobenius norm
      NORM=0.D0
      DO I=1,NQ
        DO J=1,NQ
          NORM=NORM+Q(I,J)**2
        END DO
      END DO
      NORM=SQRT(NORM)

!NP is min # halvings needed to scale DET(P).
!F is smallest power of 2 larger than F-norm of Q
      NP=MAX(0,INT(LOG(ABS(NORM))/LOG(2.0D0)+1))
      F=2.0D0**DBLE(NP)
      NORM=NORM/F

!scale QM
      DO I=1,NQ-1
        DO J=1,NQ-1
          QM(I,J)=QM(I,J)/F
        END DO
      END DO

      FACT=1.0D0
      DO H=1,100
!       WRITE(1,*)
!       WRITE(1,*)'TERM ',H
        CALL MATMULT(QT,QM,NQ,LDA)
        DO I=1,NQ-1
          DO J=1,NQ-1
            QT(I,J)=QT(I,J)/H
            E(I,J)=E(I,J)+QT(I,J)
          END DO
!         WRITE(1,10)(QT(I,J),J=1,NQ)
        END DO
        FACT=FACT*(H+1)
!Standish's stopping condition
        IF((NORM*2)**H/(2*FACT).LT.ERROR) GOTO 20
!Liou's stopping condition
        IF(NORM*(H+2)/((H+2-NORM)*FACT).LT.ERROR) GOTO 20
      END DO
      WRITE(2,*)
      WRITE(2,*)'Warning: exp(Q*DELTA) may not be accurate.'
      WRITE(2,*)'100 terms of the Taylor series were used without',
     1 ' achieving the stopping condition'
   10 FORMAT(<NQ>E9.2)

!E is now exp of the scaled P matrix.
!First, unscale by squaring E NP times.
   20 DO I=1,NP
        CALL MATMULT(E,E,NQ,LDA)
      END DO

!now add last column and last row to complete exp(Q*DELTA)
      DO I=1,NQ-1
        E(I,NQ)=1.0D0
        DO J=1,NQ-1
          E(I,NQ)=E(I,NQ)-E(I,J)
        END DO
        E(NQ,I)=0.0D0
      END DO
      E(NQ,NQ)=1.0D0
      RETURN
      END
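
The scale-and-square idea in EXP can be tried on a matrix whose exponential is known in closed form. The sketch below is a simplified free-form Fortran 90 illustration, not the routine above: it uses the intrinsic MATMUL on a full 2x2 matrix, a fixed 30 Taylor terms instead of the Standish and Liou stopping tests, and an assumed generator with one transient state (rate 3) and one absorbing state, for which exp(Q) has first row (exp(-3), 1-exp(-3)).

```fortran
! Illustrative sketch only: exp(Q) by scaling, Taylor series, and
! repeated squaring, compared with the closed form for a 2x2 generator.
program expm_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: q(n,n), qs(n,n), term(n,n), e(n,n), nrm
  integer :: np, h, i

  ! Column-major fill: row 1 = (-3, 3), row 2 = (0, 0).
  q = reshape((/ -3.0_dp, 0.0_dp, 3.0_dp, 0.0_dp /), (/ n, n /))

  ! Number of halvings chosen from the Frobenius norm, as in EXP.
  nrm = sqrt(sum(q**2))
  np = max(0, int(log(nrm)/log(2.0_dp) + 1.0_dp))
  qs = q/2.0_dp**np

  ! Taylor series for exp(qs); term holds qs**h / h!.
  e = 0.0_dp
  term = 0.0_dp
  do i = 1, n
    e(i,i) = 1.0_dp
    term(i,i) = 1.0_dp
  end do
  do h = 1, 30
    term = matmul(term, qs)/real(h, dp)
    e = e + term
  end do

  ! Undo the scaling: square the result np times.
  do i = 1, np
    e = matmul(e, e)
  end do

  print *, 'computed e(1,1), e(1,2):', e(1,1), e(1,2)
  print *, 'closed form            :', exp(-3.0_dp), 1.0_dp - exp(-3.0_dp)
end program expm_demo
```

Here the Frobenius norm is about 4.24, so three halvings bring the scaled norm near 0.53 and the series converges in far fewer than 30 terms. The two printed rows should agree to roughly machine precision.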

! **************************

      SUBROUTINE MATMULT(A,B,N,LDA)

!calculates A=A*B, where A,B are upper triangular and real.
!N is order of the used matrices.
!Use of triangular mult requires only 1/6 the number of flops
!as does full mult (see Golub+Van Loan, p 18)

      INTEGER I,J,H,N
      REAL*8 A(LDA,LDA),B(LDA,LDA),C(LDA,LDA)

      C=0.0D0
      DO I=1,N
        DO J=I,N
!C(I,J) is product of Ith row of A and Jth column of B
          DO H=I,J
            C(I,J)=C(I,J)+A(I,H)*B(H,J)
          END DO
        END DO
      END DO
      A=C
      RETURN
      END
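
The saving in MATMULT comes from restricting the inner index to H = I..J, since for upper triangular factors every other term is zero. The following free-form Fortran 90 sketch checks that restricted loop against the intrinsic MATMUL on an assumed 3x3 example; the program name and test data are illustrative only.

```fortran
! Illustrative sketch only: product of two upper triangular matrices
! using the restricted index range, checked against MATMUL.
program trimult_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  real(dp) :: a(n,n), b(n,n), c(n,n)
  integer :: i, j, h

  ! Build two upper triangular test matrices (zeros below the diagonal).
  a = 0.0_dp
  b = 0.0_dp
  do i = 1, n
    do j = i, n
      a(i,j) = real(i + j, dp)
      b(i,j) = real(j - i + 1, dp)
    end do
  end do

  ! Only H = I..J can contribute to C(I,J), and only J >= I is nonzero.
  c = 0.0_dp
  do i = 1, n
    do j = i, n
      do h = i, j
        c(i,j) = c(i,j) + a(i,h)*b(h,j)
      end do
    end do
  end do

  print *, 'max abs difference from MATMUL =', maxval(abs(c - matmul(a, b)))
end program trimult_demo
```

Because the test entries are small integers, the difference printed should be zero. The routine in the listing depends on both arguments actually being upper triangular; applied to a full matrix it would silently drop terms.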


! **************************

      SUBROUTINE FPFLAW

!checks for original Pentium floating point bug
!borrowed from Microsoft.
!cf: FORTRAN PowerStation Programmer's Guide, pp 550-1

      REAL OP1,OP2
      COMMON /DIVIDECHECK/ OP1,OP2
      DATA OP1 /3145727.0/, OP2 /4195835.0/

      IF(OP2/OP1.LE.1.3338) THEN
      WRITE(*,*)' WARNING: PENTIUM FLAW DETECTED. THERE IS A SMALL'
      WRITE(*,*)' CHANCE THE PROGRAM WILL GIVE INCORRECT RESULTS'
      WRITE(2,*)' WARNING: PENTIUM FLAW DETECTED. THERE IS A SMALL'
      WRITE(2,*)' CHANCE THE PROGRAM WILL GIVE INCORRECT RESULTS'
      WRITE(*,*)' CONTACT INTEL AT 1-800-628-8686 FOR MORE INFORMATION'
      WRITE(2,*)' CONTACT INTEL AT 1-800-628-8686 FOR MORE INFORMATION'
      END IF
      RETURN
      END


H.2  Input  Files 

This  section  describes  the  use  of  and  gives  examples  of  the  input  files  used,  as 
well  as  the  program  used  to  generate  the  SEQUENCE  file. 


-  DISTRIB  (page  227):  Input  file.  Tabulates  Coxian  parameters  for  a  number 
of  3-moment  sets  (mean-normalized).  Accessed  by  HCSEARCH. 

- COXINPUT (page 227): Input file for Coxian parameters. Not used in this
particular version, since Coxian parameters were generated randomly. Accessed
by subroutine INPUT.

- SEQUENCE (page 228): Input file. Enumerates all the sequences to be
evaluated in the search for the global optimum. Accessed by HCSEARCH.

- PERMUTE (page 228): Program to generate permutations of customers. It
produces the file SEQUENCE in cases where all permutations are to be tested.

Input file DISTRIB

phi2, phi3, #Erlang phases, Coxian rate, Erlang rates, trans
assumes first moment is 1.0
4.0     6.0     0    1.0      1.0       0.0
1.5     3.0     1    2.0      2.0       1.0
1.25    1.87    3    4.0      4.0       1.0
3.00    11.3    3    105.2    1.329     0.4388
5.0     31.3    3    425.5    0.7989    0.2657
3.00    67.4    1    1.029    0.05397   0.00154
5.0     188.0   1    1.095    0.05468   0.00475
0.0     0.0     0    0.0      0.0       0.0

! ************************************

Input file COXINPUT

INPUT DATA. UNFORMATTED, BUT SOME LINES ARE RESERVED FOR COMMENTS
# Customers
5
# phases for each customer
8 8 8 8 8
phase rates mu(j,k), cust indexed by rows(j), phases indexed by columns(k)
0.360 0.360 0.360 0.360 0.360 0.360 0.360 0.360
0.360 0.360 0.360 0.360 0.360 0.360 0.360 0.360
0.308 0.308 0.308 0.308 0.308 0.308 0.308 0.308
0.308 0.308 0.308 0.308 0.308 0.308 0.308 0.308
0.308 0.308 0.308 0.308 0.308 0.308 0.308 0.308
transition probs b(j,k), cust indexed by rows, phases indexed by columns.
b(j,0) is gamma(j), the show probability
0.95 0.984 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
0.95 0.984 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
0.92 0.982 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
0.92 0.982 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
0.92 0.982 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
cost coefficients 2 through N+1
1.0D+00 1.0D+00 1.0D+00 1.0D+00 1.0D+00

! ************************************

Input file SEQUENCE

ALL 3-CUSTOMER SEQUENCES
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
0 0 0

! ************************************


      PROGRAM PERMUTE

!program to create a file of permutations
!for use in validating sequencing algorithms
!in program HCSEARCH

      INTEGER I,N,PERM(400000,9),PERM1(400000,9),NPERM
      CHARACTER*9 ALPH
      OPEN(1,FILE='OUT')

      ALPH='123456789'       !put items to be permuted here
      WRITE(*,*)'Input number of objects to be permuted:'
      READ(*,*)N
      PERM1(1,1)=1
      NPERM=1
      I=1
      CALL PERMADD(N,I,NPERM,PERM1,PERM)

      WRITE(1,10)'All the permutations of ',N,' customers'
   10 FORMAT(A24,I2,A10)

      DO WHILE(PERM(I,1).GT.0)
        WRITE(1,*)(ALPH(PERM(I,J):PERM(I,J)),' ',J=1,N)
        I=I+1
      END DO

!append a row of zeros for use in HCSEARCH
      WRITE(1,*)('0 ',I=1,N)
      WRITE(*,*)I-1,'TOTAL PERMUTATIONS CALCULATED'
      CLOSE(1)
      END

! ****************************


      RECURSIVE SUBROUTINE PERMADD(N,I,NPERM,PERM1,PERM)

!takes each of the I! permutations of I customers
!and inserts the (I+1)st customer at every possible
!point to create all permutations of I+1 customers

      INTEGER COUNT,N,I,NPERM,PERM(400000,9),PERM1(400000,9),J,K
      COUNT=0
      I=I+1

      DO J=1,NPERM
        DO K=1,I
          COUNT=COUNT+1
          DO M=1,K-1
            PERM(COUNT,M)=PERM1(J,M)
          END DO
          PERM(COUNT,K)=I
          DO M=K+1,I
            PERM(COUNT,M)=PERM1(J,M-1)
          END DO
        END DO
      END DO

      DO J=1,COUNT
        DO K=1,I
          PERM1(J,K)=PERM(J,K)
        END DO
      END DO

      IF(I.LT.N) CALL PERMADD(N,I,COUNT,PERM1,PERM)
      RETURN
      END
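
The insertion scheme described in PERMADD's comments (take every permutation of I items and insert item I+1 at each of the I+1 possible positions) can be seen on a small case. The sketch below is an iterative free-form Fortran 90 illustration for N = 3 rather than the recursive routine above; the program name, array sizes, and output format are assumptions.

```fortran
! Illustrative sketch only: build all permutations of 1..n by inserting
! each new item at every position of every shorter permutation.
program perm_insert_demo
  implicit none
  integer, parameter :: n = 3, maxp = 6    ! maxp = n! for n = 3
  integer :: perm(maxp, n), old(maxp, n)
  integer :: i, j, k, nold, cnt

  perm = 0
  old = 0
  old(1,1) = 1     ! the single permutation of one item
  nold = 1

  do i = 2, n
    cnt = 0
    do j = 1, nold               ! each permutation of i-1 items
      do k = 1, i                ! each insertion point for item i
        cnt = cnt + 1
        perm(cnt, 1:k-1) = old(j, 1:k-1)
        perm(cnt, k)     = i
        perm(cnt, k+1:i) = old(j, k:i-1)
      end do
    end do
    old(1:cnt, 1:i) = perm(1:cnt, 1:i)
    nold = cnt
  end do

  do j = 1, nold
    print '(3i2)', old(j, :)
  end do
end program perm_insert_demo
```

For N = 3 this produces the six permutations in insertion order (3 2 1, 2 3 1, 2 1 3, 3 1 2, 1 3 2, 1 2 3), which differs from the lexicographic order shown in the sample SEQUENCE file.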


H.3 Alternative Matrix Exponentiation Routines

The following routines are alternatives to the matrix exponentiation routine
EXP (page 223 in the previous section). The algorithms provide mathematically
exact results, but when they are applied using floating-point arithmetic, they may
produce substandard results for the exponentiations required in schedule evaluations.
They are included here only to support the discussion in Appendix G.

- CAYHAM (page 230): Cayley-Hamilton algorithm discussed in Section G.3.
Calls subroutine EIGENVAL (page 233).

- PARLETT (page 234): Parlett's algorithm discussed in Section G.5.
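
The gap between exact arithmetic and floating-point arithmetic can be seen even in the scalar case. The sketch below is a free-form Fortran 90 illustration, not one of the matrix routines: it sums the Taylor series of exp(x) directly at the assumed value x = -30 and reports the largest term encountered, in the spirit of the largest-term check that CAYHAM performs.

```fortran
! Illustrative sketch only: a mathematically exact series that loses
! its accuracy to cancellation in double precision.
program cancel_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, term, total, biggest
  integer :: k

  x = -30.0_dp
  term = 1.0_dp      ! x**0 / 0!
  total = 1.0_dp
  biggest = 1.0_dp

  do k = 1, 200
    term = term*x/real(k, dp)          ! x**k / k!
    total = total + term
    biggest = max(biggest, abs(term))
  end do

  print *, 'series sum        =', total
  print *, 'intrinsic exp(x)  =', exp(x)
  print *, 'largest |term|    =', biggest
  print *, 'ratio largest/exp =', biggest/exp(x)
end program cancel_demo
```

The largest term is on the order of 1E11 while the true value is on the order of 1E-13, so the ratio far exceeds what double precision can absorb and the summed series should not be expected to match the intrinsic result.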


      SUBROUTINE CAYHAM(Q2,E2,LDA,NQ,EMAX)

!Finds the exponential of an NxN matrix using Cayley-Hamilton
!theorem. Allows eigenvalues to be complex, but that capability
!isn't needed for this dissertation. The matrix is required to be
!real here, but that requirement can be relaxed by changing types
!in all routines. The program can also be used to find other
!well-defined functions of a matrix, simply by changing the
!definition of B.

!Errors in this approach are almost entirely due to floating point
!truncation of very large contributing terms which nearly cancel
!each other (Van Loan's "catastrophic cancellation").

!Therefore, error is estimated by comparing the size of
!the absolute value of the largest contributing term to the final
!result in each element of the computed exp(Q). Double precision
!arithmetic allows for only 15 decimal places of accuracy, so if
!the ratio is larger than 1E13, there is the possibility that the
!result has fewer than two decimal places of accuracy.

!EIGVAL  the eigenvalues of Q
!EIG     the distinct eigenvalues of Q
!MULT    the multiplicity of each eigenvalue in EIG
!NEIG    the number of elements in EIG
!LDA     the dimensions of the matrix storage spaces
!NEQ     used dimensions of matrices
!E       EXP(Q)
!EMAX    max of the abs value of the terms E comprises
!ETERMS  the matrix of abs values of the terms E comprises

!Microsoft-specific: invokes IMSL link
      USE MSIMSL

      INTEGER I,J,K,UPTRI,NEIG,MULT(LDA),FACT(0:LDA),IPATH,NQ
      COMPLEX(8) EIGVAL(LDA),EIG(LDA),A(LDA),C(LDA,LDA),B(LDA)
      COMPLEX(8) E(LDA,LDA)
      REAL*8 Q(LDA,LDA),QP(LDA,LDA),TEMP,TEMP2
      REAL*8 E2(LDA,LDA),Q2(LDA,LDA),EMAX(LDA,LDA)

!initialize
      DO I=1,LDA
        DO J=1,LDA
          Q(I,J)=Q2(I,J)
          C(I,J)=0.00D+00
          E(I,J)=0.00D+00
          EMAX(I,J)=0.00D+00
          QP(I,J)=0.0D+00
        END DO
      END DO

!tell EIGENVAL routine that Q is upper triangular, get eigenvalues
      UPTRI=1
      CALL EIGENVAL(Q,NQ,LDA,UPTRI,EIGVAL)

!determine list of distinct eigenvalues and their multiplicities
      EIG(1)=EIGVAL(1)
      NEIG=1
      MULT(1)=1
      DO I=2,NQ
        DO J=1,NEIG
          IF(EIGVAL(I).EQ.EIG(J)) THEN
            MULT(J)=MULT(J)+1
            GOTO 10
          END IF
        END DO
        NEIG=NEIG+1
        MULT(NEIG)=1
        EIG(NEIG)=EIGVAL(I)
   10 END DO

!calculate factorials needed
      FACT(0)=1
      DO I=1,NQ
        FACT(I)=FACT(I-1)*I
      END DO

!calculate C and B in B=CA, where A are the polynomial coefficients
!in exp(Q)=A(0)+A(1)*Q**1+A(2)*Q**2+...+A(N)*Q**N
!A and B may be complex if Q isn't real or if it isn't triangular
      NEQ=0
      DO I=1,NEIG
        DO K=1,MULT(I)
          NEQ=NEQ+1
          DO J=K-1,NQ-1
            C(NEQ,J+1)=EIG(I)**(J-K+1)*FACT(J)/FACT(J-K+1)
          END DO
!if another (well-defined) function of Q other than exp(Q) is
!to be calculated, only the following statement need be changed
!to reflect the derivative of order K-1 with respect to Q of
!that function
          B(NEQ)=EXP(EIG(I))
!WRITE(1,*) 'B(',NEQ,') = ',B(NEQ)
!WRITE(1,*) (C(NEQ,J),J=1,NQ)
!WRITE(1,*)
        END DO
      END DO

!solve B=C*A for vector A using IMSL routine
      IPATH=1
      CALL DLSACG(NQ,C,LDA,B,IPATH,A)
!WRITE(1,*)'A Coefficients:'
!WRITE(1,*)(A(I),I=1,NQ)

!continue to add polynomial terms to E
!inefficient, but we need to track magnitudes of each contribution
!to each element to ensure the range is not too great for accuracy
      DO I=1,NQ
        E(I,I)=A(1)
        QP(I,I)=1.00D+00
      END DO

      DO I=1,NQ-1
        CALL MATMULT(QP,Q,NQ,LDA)
        DO J=1,NQ
          DO K=1,NQ
            TEMP=ABS(A(I+1))*QP(J,K)
            EMAX(J,K)=MAX(TEMP,EMAX(J,K))
            E(J,K)=E(J,K)+TEMP
            IF(J.EQ.1 .AND. K.EQ.4) WRITE(1,*)TEMP,E(J,K)
          END DO
        END DO
      END DO

!if error analysis not needed, the following is more efficient
!it requires a type change in MATMULT, though
!calculate E=A(0)+A(1)E+...+A(NQ)E**NQ
!recursively calculates E=E*Q+A(J)*IDENTITY
!DO I=NQ-1,1,-1
!calculates E=E*Q
!CALL MATMULT(E,Q,NQ,LDA)
!DO J=1,NQ
!E(J,J)=E(J,J)+A(I)
!END DO
!END DO

   20 FORMAT(<NQ>E10.2)

!use ratio of E to EMAX to see if there were massive cancellations
      TEMP=1.E30
      WRITE(1,*)
      WRITE(1,*)
      DO I=1,NQ
        DO J=1,NQ
          TEMP2=ABS(E(I,J)+1.E-30)/ABS(EMAX(I,J)+1.E-30)
          TEMP=MIN(TEMP,TEMP2)
          IF(TEMP2.LE.1E-13) THEN
            WRITE(1,*)I,J,' Some elements of EXP(Q) had a very large',
     1        ' cancellation.'
            WRITE(1,*)'There may be an accuracy problem as a result.'
            GOTO 40
          END IF
        END DO
      END DO

!pass back REAL*4 representation of E
   40 DO I=1,NQ
        DO J=1,NQ
          E2(I,J)=REAL(E(I,J))
        END DO
      END DO
      RETURN
      END


! ************************************

      SUBROUTINE EIGENVAL(Q,NQ,LDA,UPTRI,EIGVAL)

!Finds eigenvalues of Q, either using IMSL routine or, in the cases
!of interest to this dissertation, since Q is upper-triangular,
!simply returns diagonal of Q.

      INTEGER LDA,I,UPTRI,NQ
      REAL*8 Q(LDA,LDA)
      COMPLEX(8) EIGVAL(LDA)

!find eigenvalues
      IF(UPTRI.EQ.1) THEN
        DO I=1,LDA
          EIGVAL(I)=Q(I,I)
        END DO
      ELSE
        CALL DEVLRG(NQ,Q,LDA,EIGVAL)
      END IF

!WRITE(*,*)'Eigenvalues: ',(EIGVAL(I),I=1,N)
      RETURN
      END


      SUBROUTINE PARLETT(Q2,E2,LDA,NQ,ERROR)

!Employs Parlett's method to obtain exp(Q/m). Since that method does
!not allow for confluent eigenvalues, a small amount (ERROR) is added
!or subtracted from each confluent eigenvalue to ensure they are
!distinct. If there are N confluent eigenvalues, this routine causes
!them to differ by between 0 and INT(N/2)*2E. Once the
!eigenvalues are adjusted, the entire row is scaled accordingly.

!LDA    the dimensions of the matrix storage spaces
!NQ     used dimensions of matrices
!E      EXP(Q)
!Q      input matrix
!ERROR  adjustments to eigenvalues
!ICOUNT counts number of eigenvalues identical to current test
!FLAG   flags possibly catastrophic cancellations
!TMAX   max term of the element of E currently being calculated

      INTEGER I,J,NQ,LDA,K,FLAG
      REAL*8 Q(NQ,NQ),E(NQ,NQ),TMAX,TERM,ERROR
      REAL*4 Q2(LDA,LDA),E2(LDA,LDA)

!initialize
      FLAG=0
      DO I=1,NQ-1
        DO J=1,NQ-1
          Q(I,J)=Q2(I,J)
          E(I,J)=0.0E+00
        END DO
      END DO

      DO I=1,NQ-2
        ICOUNT=1
        DO J=I+1,NQ-1
          IF(Q(I,I).EQ.Q(J,J))THEN
            ICOUNT=ICOUNT+1
            C=1+(-1)**ICOUNT*INT(ICOUNT/2)*ERROR
            DO K=J,NQ-1
              Q(J,K)=Q(J,K)*C
            END DO
          END IF
        END DO
      END DO

      DO I=1,NQ-2
        DO J=I+1,NQ-1
          IF(Q(I,I).EQ.Q(J,J))THEN
            WRITE(1,*)'This is one of the unlikely cases in which',
     1        ' the eigenvalue dithering routine produced two '
            WRITE(1,*)'or more confluent eigenvalues. Suggest you',
     1        ' restart the routine after changing ERROR slightly.'
            STOP
          END IF
        END DO
      END DO

      WRITE(1,*) 'P: '
      DO J=1,NQ-1
        WRITE(1,10)(Q(J,I),I=1,NQ-1)
      END DO
   10 FORMAT (<NQ-1>E13.6)

!now that eigenvalues are distinct, we can use Parlett's algorithm
      DO I=1,NQ-1
        E(I,I)=EXP(Q(I,I))
      END DO

      DO I=1,NQ-2              !# diagonals from main diagonal
        DO J=1,NQ-1-I          !# row. # column is I+J
          E(J,I+J)=Q(J,I+J)*(E(I+J,I+J)-E(J,J))
          TMAX=E(I,J)
          DO K=J+1,I+J-1
            TERM=Q(J,K)*E(K,I+J)-E(J,K)*Q(K,I+J)
            TMAX=MAX(TMAX,ABS(TERM))
            E(J,I+J)=E(J,I+J)+TERM
          END DO
          IF(TMAX/E(J,I+J).GT.1.0E+13) FLAG=1
          E(J,I+J)=E(J,I+J)/(Q(I+J,I+J)-Q(J,J))
        END DO
      END DO

!establish last row and column of E
      DO I=1,NQ-1
        E(I,NQ)=1.0D0
        DO J=I,NQ-1
          E(I,NQ)=E(I,NQ)-E(I,J)
        END DO
        E(NQ,I)=0.0D+00
      END DO
      E(NQ,NQ)=1.0D0

!must return E as REAL*4
      E2(I,J)=E(I,J)

      IF(FLAG.EQ.1) THEN
        WRITE(1,*)
        WRITE(1,*)'Warning: one or more elements of EXP(Q) had',
     1    ' potentially catastrophic cancellation. '
        WRITE(1,*)'Accuracy of EXP(Q) is questionable. Try',
     1    ' re-running with larger ERROR if feasible.'
      END IF
      RETURN
      END
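
The recurrence at the heart of PARLETT fills exp(T) one superdiagonal at a time and divides by differences of diagonal entries, which is why distinct eigenvalues are required. The sketch below is a free-form Fortran 90 illustration on an assumed 3x3 upper triangular matrix with diagonal (-1, -2, -3); it omits the eigenvalue dithering and the cancellation flag, and it uses a plain 40-term Taylor series as the reference. The names and data are illustrative only.

```fortran
! Illustrative sketch only: Parlett's recurrence for F = exp(T), T upper
! triangular with distinct diagonal entries, checked against a Taylor sum.
program parlett_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  real(dp) :: t(n,n), f(n,n), ref(n,n), term(n,n), s
  integer :: i, j, k, d

  t = 0.0_dp
  t(1,1) = -1.0_dp
  t(1,2) = 0.5_dp
  t(1,3) = 0.5_dp
  t(2,2) = -2.0_dp
  t(2,3) = 2.0_dp
  t(3,3) = -3.0_dp

  ! Diagonal of F is the scalar exponential of the diagonal of T.
  f = 0.0_dp
  do i = 1, n
    f(i,i) = exp(t(i,i))
  end do

  ! Superdiagonal d: F(i,j) follows from F*T = T*F at entry (i,j).
  do d = 1, n - 1
    do i = 1, n - d
      j = i + d
      s = t(i,j)*(f(j,j) - f(i,i))
      do k = i + 1, j - 1
        s = s + t(i,k)*f(k,j) - f(i,k)*t(k,j)
      end do
      f(i,j) = s/(t(j,j) - t(i,i))
    end do
  end do

  ! Reference value from the Taylor series of exp(T).
  ref = 0.0_dp
  term = 0.0_dp
  do i = 1, n
    ref(i,i) = 1.0_dp
    term(i,i) = 1.0_dp
  end do
  do k = 1, 40
    term = matmul(term, t)/real(k, dp)
    ref = ref + term
  end do

  print *, 'max abs difference =', maxval(abs(f - ref))
end program parlett_demo
```

With well-separated diagonal entries the two results should agree closely. As the diagonal entries approach one another the divisor t(j,j) - t(i,i) shrinks, which is the situation the dithering step in PARLETT is meant to avoid.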
H.4 Moment Matching Routine


This Microsoft Excel™ spreadsheet is used to determine the parameters r,
w, and b for the Cox-plus-Erlang-r distribution that matches the desired first three
(noncentral) moments, m1, m2, and m3, as discussed in Section F.9. It uses Excel's
resident NLP to perform a line search to find the best value of w. The other
parameters are determined analytically.


Row  A       B                                                                      C
  1  Find a Cox+Erlang-r distribution which
  2  matches the first 3 moments
  4  Change only these moments:
  5  m1      1
  6  m2      3
  7  m3      10
  9  phi2    =B6/B5**2
 10  phi3    =B7/B5**3
 11  =IF(B10/B9**2.ge.1,"PROBLEM IS FEASIBLE","NOT FEASIBLE!")
 13  1       =IF((A13+2)/(A13+1).lt.B10/B9**2,1,0)                                  =A13*B13
 14  2       =IF(AND((A14+2)/(A14+1).lt.B10/B9**2,(A14+1)/(A14).ge.B10/B9**2),1,0)  =A14*B14
 15  3       =IF(AND((A15+2)/(A15+1).lt.B10/B9**2,(A15+1)/(A15).ge.B10/B9**2),1,0)  =A15*B15
 16  4       =IF(AND((A16+2)/(A16+1).lt.B10/B9**2,(A16+1)/(A16).ge.B10/B9**2),1,0)  =A16*B16
 17  5       =IF(AND((A17+2)/(A17+1).lt.B10/B9**2,(A17+1)/(A17).ge.B10/B9**2),1,0)  =A17*B17
 18  6       =IF(AND((A18+2)/(A18+1).lt.B10/B9**2,(A18+1)/(A18).ge.B10/B9**2),1,0)  =A18*B18
 19  7       =IF(AND((A19+2)/(A19+1).lt.B10/B9**2,(A19+1)/(A19).ge.B10/B9**2),1,0)  =A19*B19
 20  8       =IF(AND((A20+2)/(A20+1).lt.B10/B9**2,(A20+1)/(A20).ge.B10/B9**2),1,0)  =A20*B20
 21  9       =IF(AND((A21+2)/(A21+1).lt.B10/B9**2,(A21+1)/(A21).ge.B10/B9**2),1,0)  =A21*B21
 22  10      =IF(AND((A22+2)/(A22+1).lt.B10/B9**2,(A22+1)/(A22).ge.B10/B9**2),1,0)  =A22*B22
 25  To obtain desired coefficients w and b, use Excel's
 26  solver to minimize object wrt w. You may have
 27  to change the sign of zeta to get a feasible answer
     =IF(B31=0,"whoa! r must be larger than expected. Expand table"," ")
 30  w       0.1
 31  rr      =SUM(C13:C22)
 32  zeta    =-SQRT((rr+1)**2+4*w*(rr+1)*(1-phi2)+4*w**2)
 33  bb      =(2*w*(1-phi2)+rr+1+zeta)/(2*phi2*rr)
 34  phi3    =(6*w**3+6*bb*rr*w**2+3*bb*rr**2*w+3*bb*rr*w+bb*rr**3+3*bb*rr**2+2*bb*rr)/(w+bb*rr)**3
 35  object  =(B34-B10)**2
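
The spreadsheet calculation can also be expressed in code. The following is a minimal free-form Fortran sketch under stated assumptions, not a translation supplied with this work. The module cox_erlang_sketch and the functions pick_r, b_of_w, phi3_model, and best_w are hypothetical names. A coarse grid search over w stands in for Excel's solver. The argument sgn selects the sign of zeta (the spreadsheet uses -1 and notes that the sign may need to be changed). No bounds on b are enforced.

```fortran
module cox_erlang_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Rows 13-22: the r with (r+2)/(r+1) < phi3/phi2**2 <= (r+1)/r.
  ! Only the first condition applies at r = 1. Returns 0 if no r <= rmax fits.
  integer function pick_r(phi2, phi3, rmax) result(r)
    real(dp), intent(in) :: phi2, phi3
    integer, intent(in) :: rmax
    real(dp) :: ratio
    integer :: k
    logical :: lower_ok, upper_ok
    ratio = phi3/phi2**2
    r = 0
    do k = 1, rmax
      lower_ok = real(k+2, dp)/real(k+1, dp) < ratio
      upper_ok = (k == 1) .or. (real(k+1, dp)/real(k, dp) >= ratio)
      if (lower_ok .and. upper_ok) then
        r = k
        return
      end if
    end do
  end function pick_r

  ! Rows 32-33: b for a given w. A negative radicand yields NaN.
  real(dp) function b_of_w(w, r, phi2, sgn) result(b)
    real(dp), intent(in) :: w, phi2, sgn
    integer, intent(in) :: r
    real(dp) :: zeta
    zeta = sgn*sqrt((r+1)**2 + 4*w*(r+1)*(1-phi2) + 4*w**2)
    b = (2*w*(1-phi2) + r + 1 + zeta)/(2*phi2*r)
  end function b_of_w

  ! Row 34: scaled third moment m3/m1**3 implied by (w, b, r).
  real(dp) function phi3_model(w, b, r) result(p)
    real(dp), intent(in) :: w, b
    integer, intent(in) :: r
    real(dp) :: rr
    rr = real(r, dp)
    p = (6*w**3 + 6*b*rr*w**2 + 3*b*rr**2*w + 3*b*rr*w &
         + b*rr**3 + 3*b*rr**2 + 2*b*rr)/(w + b*rr)**3
  end function phi3_model

  ! Row 35 minimized over n grid points w in (0, wmax].
  real(dp) function best_w(r, phi2, phi3, sgn, wmax, n) result(wbest)
    integer, intent(in) :: r, n
    real(dp), intent(in) :: phi2, phi3, sgn, wmax
    real(dp) :: w, obj, objbest
    integer :: i
    objbest = huge(1.0_dp)
    wbest = 0.0_dp
    do i = 1, n
      w = wmax*i/n
      obj = (phi3_model(w, b_of_w(w, r, phi2, sgn), r) - phi3)**2
      if (obj < objbest) then        ! NaN compares false, so such w are skipped
        objbest = obj
        wbest = w
      end if
    end do
  end function best_w
end module cox_erlang_sketch
```

A typical call sequence would compute phi2 = m2/m1**2 and phi3 = m3/m1**3, obtain r from pick_r, and then call best_w with sgn = -1 and, if the result is unsatisfactory, again with sgn = +1. A grid search is used here only because it is short and cannot diverge; a proper line search would refine the result.
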
H.5  Scheduling Simulation Code

The following FORTRAN 90 code was used in Appendix E to simulate the
assignment of a combination to each day. It is assumed that six customers request
appointments each day, each customer belonging to one of three classes. Arrival
probabilities are specified for each class. For each day, a combination must be chosen.
A combination allots six appointments on a given day, each one being dedicated
to one class of customer. The measures of merit of such a system include the average
system cost, the number of unfillable appointments, the average delay between patient
arrival and appointment, and the number of unacceptable delays. The system
is complex enough to require simulation for all but the simplest cases.


      PROGRAM COMBINATION

!NOW: current day of customer to be scheduled
!NOW2: nonvolatile copy of NOW
!DAYNOW: what day it is
!LASTFIXED: the last day of schedules already fixed
!POSS: the possible schedules for each day
!COST: cost of each poss
!SCHED: actual schedule each day - volatile
!SCHED2: nonvolatile copy of SCHED
!WAIT(I): number of customers who had to wait I for appointment
!OFF(I): how much excess schedule capacity for customer I
!SCORE: current rating of each possible candidate combination

      INTEGER NOW(3),DAYNOW,SCHED(1000,3),SUM,NOW2(1000,3)
      INTEGER POSS(5,3),LASTFIXED,TOTAL(3),OFF(3),DAY,TCOST
      INTEGER SCHED2(1000),WAIT(0:25),NUMDAYS,TEMP,PICK
      REAL R,COST(5),SCORE(5),MAXWAIT
      OPEN(1,FILE='COMBIN.DAT')
      OPEN(2,FILE='OUT')
      WRITE(*,*)'RANDOM SEED:'
      READ(*,*)J
      WRITE(2,*)'RANDOM SEED: ',J
      R=RAND(J)
      WRITE(*,*)'How many days to schedule?'
      READ(*,*)NUMDAYS
      LASTFIXED=0
      WAIT=0
!accumulators and schedule arrays must start at zero
      TOTAL=0
      TCOST=0
      SCHED=0
      SCHED2=0

!read in combination candidates
      DO J=1,5
        READ(1,*)(POSS(J,I),I=1,3),COST(J)
      END DO

      DO DAYNOW=1,NUMDAYS
        NOW=0
        DO J=1,6 !6 new customers in
          R=RAND(0)
          IF(R.LT.0.237)THEN
            NOW(1)=NOW(1)+1
          ELSE IF(R.LT.0.597)THEN
            NOW(2)=NOW(2)+1
          ELSE
            NOW(3)=NOW(3)+1
          END IF
        END DO !J
        DO J=1,3
          NOW2(DAYNOW,J)=NOW(J)
        END DO
        TOTAL=TOTAL+NOW

!at each iteration, schedule all you can
10      OFF=0
        DO DAY=DAYNOW,LASTFIXED
          MAXWAIT=0.0
          DO J=1,3
            REDUCE=MIN(FLOAT(NOW(J)),FLOAT(SCHED(DAY,J)))
            NOW(J)=NOW(J)-REDUCE
            SCHED(DAY,J)=SCHED(DAY,J)-REDUCE
            WAIT(DAY-DAYNOW)=WAIT(DAY-DAYNOW)+REDUCE
            MAXWAIT=MAX(MAXWAIT,FLOAT(DAY-DAYNOW))
            OFF(J)=OFF(J)+SCHED(DAY,J)
          END DO
        END DO


!All are scheduled for current day. Go to next day.
        IF(NOW(1)+NOW(2)+NOW(3).EQ.0)GOTO 30

!All are not scheduled. Choose the schedule for day LASTFIXED+1
        SCORE=0
        TEMP=10000
        DO J=1,5
          DO I=1,3
            SCORE(J)=SCORE(J)+3*MAX(0.0,FLOAT(NOW(I)-POSS(J,I)))
            SCORE(J)=SCORE(J)+OFF(I)+MAX(0.0,FLOAT(POSS(J,I)-NOW(I)))
          END DO
          SCORE(J)=SCORE(J)
          IF(SCORE(J).LT.TEMP)THEN
            PICK=J
            TEMP=SCORE(J)
          END IF
        END DO

        TCOST=TCOST+COST(PICK) !increment total cost
20      LASTFIXED=LASTFIXED+1
        DO J=1,3 !implement schedule number PICK for day LASTFIXED
          SCHED(LASTFIXED,J)=POSS(PICK,J)
        END DO
        SCHED2(LASTFIXED)=PICK
        GOTO 10

30    END DO !DAYNOW


!Done. Record results.
      WRITE(2,*)
      WRITE(2,*)'TOTAL ARRIVALS: ',(TOTAL(I),I=1,3)
      WRITE(2,*)
      WRITE(2,*)' DAY ARRIVALS SCHEDULE'
      NOW=0
      DO J=1,NUMDAYS
        WRITE(2,40)J,'  ',(NOW2(J,I),I=1,3),'  ',(SCHED2(J))
40      FORMAT(I3,A4,3I3,A4,I3)
        DO I=1,3
          NOW(I)=NOW(I)+SCHED(J,I)
        END DO
      END DO
      WRITE(2,*)
      WRITE(2,*)'EMPTY SLOTS: ',(NOW(I),I=1,3)
      WRITE(2,*)
      WRITE(2,*)'DAYS TO WAIT #CUST'
      WRITE(*,*)'DAYS TO WAIT #CUST'
      SUM=0
      DO I=0,25
        WRITE(2,*)I,WAIT(I)
        WRITE(*,*)I,WAIT(I)
        IF(WAIT(I).GT.0)TEMP=I
        IF(I.GT.5)SUM=SUM+WAIT(I)
      END DO
      WRITE(*,*)'Max wait was',TEMP
      WRITE(*,*)'Total over was ',SUM,' OR',
     1 100.0*FLOAT(SUM)/6.0/FLOAT(NUMDAYS),'%'
      WRITE(*,*)
      WRITE(*,*)'Cost per day was ',FLOAT(TCOST)/FLOAT(NUMDAYS)
      CLOSE(1)
      CLOSE(2)
      END


!************************************


Input file COMBIN.DAT

1 3 2 13.86
2 1 3 12.01
1 4 1 11.24
3 3 0 4.21
0 0 6 38.6
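
The step that chooses a combination for day LASTFIXED+1 can be examined on its own. The following free-form Fortran sketch restates the scoring rule from the listing above as a function. The module pick_sketch and the function pick_combination are hypothetical names, and integer arithmetic is assumed in place of the REAL scores used in the program. Each unit of unmet demand is weighted by 3 and each unit of spare capacity by 1, and the first combination with the lowest score is chosen.

```fortran
module pick_sketch
  implicit none
contains
  ! Returns the row of poss with the lowest score for the current backlog.
  ! now(i): customers of class i still to be scheduled
  ! off(i): excess capacity already fixed for class i
  ! poss(j,i): appointments combination j dedicates to class i
  integer function pick_combination(now, off, poss) result(pick)
    integer, intent(in) :: now(:), off(:), poss(:,:)
    integer :: i, j, score, best
    best = huge(best)
    pick = 0
    do j = 1, size(poss, 1)
      score = 0
      do i = 1, size(now)
        score = score + 3*max(0, now(i) - poss(j,i))         ! unmet demand
        score = score + off(i) + max(0, poss(j,i) - now(i))  ! spare capacity
      end do
      if (score < best) then      ! strict test keeps the first of any ties
        best = score
        pick = j
      end if
    end do
  end function pick_combination
end module pick_sketch
```

With the five combinations in COMBIN.DAT and no excess capacity, a backlog of (3, 3, 0) selects combination 4, whose score is zero. Because off(i) is added to every candidate's score equally, it shifts the scores without changing which combination is chosen.
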